NFDI4DS | UHH-SEMS - Publication Details

Heming Cui

ORCID: 0000-0001-7746-440X

About

Contact & Profiles

A5015360935

Research Areas

Distributed systems and fault tolerance
Software Testing and Debugging Techniques
Parallel Computing and Optimization Techniques
Cloud Computing and Resource Management
Adversarial Robustness in Machine Learning
Advanced Data Storage Technologies
Security and Verification in Computing
Advanced Malware Detection Techniques
Mobile Ad Hoc Networks
Blockchain Technology Applications and Security
Advanced Neural Network Applications
Robotic Path Planning Algorithms
Software Engineering Research
Software System Performance and Reliability
Robotics and Sensor-Based Localization
Distributed and Parallel Computing Systems
Cloud Data Security Solutions
Wireless Networks and Protocols
Software Reliability and Analysis Research
Energy Efficient Wireless Sensor Networks
Robotics and Automated Systems
Caching and Content Delivery
Embedded Systems Design Techniques
Advanced Memory and Neural Computing
Domain Adaptation and Few-Shot Learning

University of Hong Kong
2016-2025

Shanghai Artificial Intelligence Laboratory
2023-2024

Chinese University of Hong Kong
2018-2023

Beijing Academy of Artificial Intelligence
2023

Shanghai Zhangjiang Laboratory
2022

Columbia University
2010-2014

Tsinghua University
2008

Parrot

OPENALEX - Publications

Heming Cui, Jiří Šimša, Yihong Lin, Hao Li, Ben Blum, and 4 more

Multithreaded programs are hard to get right. A key reason is that the contract between developers and runtimes grants exponentially many schedules to the runtimes. We present Parrot, a simple, practical runtime with a new contract to developers. By default, it orders thread synchronizations in the well-defined round-robin order, vastly reducing schedules to provide determinism (more precisely, deterministic synchronizations) and stability (i.e., robustness against input or code perturbations, a more useful property than determinism)....

10.1145/2517349.2522735 article EN 2013-10-08

One fuzzing strategy to rule them all

Mingyuan Wu, Ling Jiang, Jiahong Xiang, Yanwei Huang, Heming Cui, and 2 more

Coverage-guided fuzzing has become mainstream in fuzzing to automatically expose program vulnerabilities. Recently, a group of fuzzers have been proposed that adopt a random search mechanism, namely Havoc, explicitly or implicitly, to augment their edge exploration. However, they tend to adopt only the default setup of Havoc as an implementation option, while none of them attempts to explore its power under diverse setups or inspect its rationale for potential improvement. In this paper, to address such issues, we conduct the first empirical study...

10.1145/3510003.3510174 article EN Proceedings of the 44th International Conference on Software Engineering 2022-05-21

Efficient deterministic multithreading through schedule relaxation

Heming Cui, Jingyue Wu, John P. Gallagher, Huayang Guo, Junfeng Yang

Deterministic multithreading (DMT) eliminates many pernicious software problems caused by nondeterminism. It works by constraining a program to repeat the same thread interleavings, or schedules, when given the same input. Despite much recent research, it remains an open challenge to build both deterministic and efficient DMT systems for general programs on commodity hardware. To deterministically resolve a data race, a DMT system must enforce a deterministic schedule of shared memory accesses, or mem-schedule, which can incur...

10.1145/2043556.2043588 article EN 2011-10-23

Stable deterministic multithreading through schedule memoization

Heming Cui, Jingyue Wu, Chia-Che Tsai, Junfeng Yang

A deterministic multithreading (DMT) system eliminates nondeterminism in thread scheduling, simplifying the development of multithreaded programs. However, existing DMT systems are unstable; they may force a program to (ad)venture into vastly different schedules even for slightly different inputs or execution environments, defeating many benefits of determinism. Moreover, few existing DMT systems work with server programs whose inputs arrive continuously and nondeterministically. TERN is a stable DMT system. The key novelty in TERN is the idea...

10.5555/1924943.1924958 article EN Operating Systems Design and Implementation 2010-10-04

APUS

Cheng Wang, Jianyu Jiang, Xusheng Chen, Ning Yi, Heming Cui

State machine replication (SMR) uses Paxos to enforce the same inputs for a program (e.g., Redis) replicated on a number of hosts, tolerating various types of failures. Unfortunately, traditional Paxos protocols incur prohibitive performance overhead on server programs due to their high consensus latency on TCP/IP. Worse, the consensus latency of extant Paxos protocols increases drastically when more concurrent client connections or hosts are added. This paper presents APUS, the first RDMA-based Paxos protocol that aims to be fast and scalable to client connections and hosts. APUS...

10.1145/3127479.3128609 article EN 2017-09-24

Verifying systems rules using rule-directed symbolic execution

Heming Cui, Gang Hu, Jingyue Wu, Junfeng Yang

Systems code must obey many rules, such as "opened files must be closed." One approach to verifying rules is static analysis, but this technique cannot infer precise runtime effects of code, often emitting false positives. An alternative is symbolic execution, a technique that verifies program paths over all inputs up to a bounded size. However, when applied to verify rules, existing symbolic execution systems blindly explore redundant program paths while missing relevant ones that may contain bugs.

10.1145/2451116.2451152 article EN 2013-03-16

Rethinking Adversarial Attacks in Reinforcement Learning from Policy Distribution Perspective

Tianyang Duan, Zongyuan Zhang, Lin Zheng, Yue Gao, Ling Xiong, and 5 more

Deep Reinforcement Learning (DRL) suffers from uncertainties and inaccuracies in the observation signal in real-world applications. Adversarial attack is an effective method for evaluating the robustness of DRL agents. However, existing attack methods targeting individual sampled actions have limited impacts on the overall policy distribution, particularly in continuous action spaces. To address these limitations, we propose the Distribution-Aware Projected Gradient Descent attack (DAPGD). DAPGD uses distribution similarity...

10.48550/arxiv.2501.03562 preprint EN arXiv (Cornell University) 2025-01-07

Hecate: Unlocking Efficient Sparse Model Training via Fully Sharded Sparse Data Parallelism

Yuhao Qing, Guannan Zhu, Fanxin Li, Lei Liang, Z. T. Sun, and 6 more

Mixture-of-Experts (MoE) has emerged as a promising sparse paradigm for scaling up pre-trained models (PTMs) with remarkable cost-effectiveness. However, the dynamic nature of MoE leads to rapid fluctuations and imbalances in expert loads during training, resulting in significant straggler effects that hinder training performance when using expert parallelism (EP). Existing MoE training systems attempt to mitigate these effects through expert rearrangement strategies, but they face challenges in terms of memory efficiency and timeliness...

10.48550/arxiv.2502.02581 preprint EN arXiv (Cornell University) 2025-02-04

Rethinking Adversarial Attacks in Reinforcement Learning from Policy Distribution Perspective

Tianyang Duan, Zongyuan Zhang, Lin Zheng, Yue Gao, Ling Xiong, and 5 more

10.1109/icassp49660.2025.10890540 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

Bias Testing and Mitigation in LLM-based Code Generation

Dong Huang, Jie M. Zhang, Qingwen Bu, Xiaofei Xie, Junjie Chen, and 1 more

As the adoption of LLMs becomes more widespread in software coding ecosystems, a pressing issue has emerged: does the generated code contain social bias and unfairness, such as those related to age, gender, and race? This issue concerns the integrity, fairness, and ethical foundation of software applications that depend on the code generated by these models but is underexplored in the literature. This paper presents a novel bias testing framework that is specifically designed for code generation tasks. Based on this framework, we conduct an extensive empirical study on the biases...

10.1145/3724117 article EN ACM Transactions on Software Engineering and Methodology 2025-03-18

State-Aware Perturbation Optimization for Robust Deep Reinforcement Learning

Zongyuan Zhang, Tianyang Duan, Zheng Lin, Dong Huang, Zihan Fang, and 5 more

10.36227/techrxiv.174320032.22893766/v1 preprint EN cc-by 2025-03-28

Robust Deep Reinforcement Learning in Robotics via Adaptive Gradient-Masked Adversarial Attacks

Zongyuan Zhang, Tianyang Duan, Zheng Lin, Dong Huang, Zihan Fang, and 6 more

10.36227/techrxiv.174320034.48055861/v1 preprint EN cc-by 2025-03-28

State-Aware Perturbation Optimization for Robust Deep Reinforcement Learning

Zongyuan Zhang, Tianyang Duan, Zheng Lin, Dong Huang, Zihan Fang, and 5 more

Recently, deep reinforcement learning (DRL) has emerged as a promising approach for robotic control. However, the deployment of DRL in real-world robots is hindered by its sensitivity to environmental perturbations. While existing white-box adversarial attacks rely on local gradient information and apply uniform perturbations across all states to evaluate DRL robustness, they fail to account for temporal dynamics and state-specific vulnerabilities. To combat the above challenge, we first conduct a theoretical...

10.48550/arxiv.2503.20613 preprint EN arXiv (Cornell University) 2025-03-26

Robust Deep Reinforcement Learning in Robotics via Adaptive Gradient-Masked Adversarial Attacks

Zongyuan Zhang, Tianyang Duan, Zheng Lin, Dong Huang, Zihan Fang, and 6 more

Deep reinforcement learning (DRL) has emerged as a promising approach for robotic control, but its real-world deployment remains challenging due to its vulnerability to environmental perturbations. Existing white-box adversarial attack methods, adapted from supervised learning, fail to effectively target DRL agents as they overlook temporal dynamics and indiscriminately perturb all state dimensions, limiting their impact on long-term rewards. To address these challenges, we propose the Adaptive...

10.48550/arxiv.2503.20844 preprint EN arXiv (Cornell University) 2025-03-26

Paxos made transparent

Heming Cui, Rui Gu, Cheng Liu, Tianyu Chen, Junfeng Yang

State machine replication (SMR) leverages distributed consensus protocols such as Paxos to keep multiple replicas of a program consistent in the face of replica failures or network partitions. This fault tolerance is enticing on implementing a principled SMR system that replicates general programs, especially server programs that demand high availability. Unfortunately, SMR assumes deterministic execution, but most server programs are multithreaded and thus nondeterministic. Moreover, existing SMR systems provide narrow state...

10.1145/2815400.2815427 article EN 2015-10-01

vPipe: A Virtualized Acceleration System for Achieving Efficient and Scalable Pipeline Parallel DNN Training

Shixiong Zhao, Fanxin Li, Xusheng Chen, Xiuxian Guan, Jianyu Jiang, and 8 more

The increasing computational complexity of DNNs achieved unprecedented successes in various areas such as machine vision and natural language processing (NLP), e.g., the recent advanced Transformer has billions of parameters. However, as large-scale DNNs significantly exceed a GPU's physical memory limit, they cannot be trained by conventional methods such as data parallelism. Pipeline parallelism that partitions a large DNN into small subnets and trains them on different GPUs is a plausible solution. Unfortunately,...

10.1109/tpds.2021.3094364 article EN cc-by-nc-nd IEEE Transactions on Parallel and Distributed Systems 2021-07-02

JITfuzz: Coverage-guided Fuzzing for JVM Just-in-Time Compilers

Mingyuan Wu, Minghai Lu, Heming Cui, Junjie Chen, Yuqun Zhang, and 1 more

As a widely-used platform to support various Java-bytecode-based applications, Java Virtual Machine (JVM) incurs severe performance loss caused by its real-time program interpretation mechanism. To tackle this issue, the Just-in-Time compiler (JIT) has been widely adopted to strengthen the efficacy of JVM. Therefore, how to effectively and efficiently detect JIT bugs becomes critical to ensure the correctness of JVM. In this paper, we propose a coverage-guided fuzzing framework, namely JITfuzz, to automatically detect JIT bugs....

10.1109/icse48619.2023.00017 article EN 2023-05-01

Bypassing races in live applications with execution filters

Jingyue Wu, Heming Cui, Junfeng Yang

Deployed multithreaded applications contain many races because these applications are difficult to write, test, and debug. Worse, the number of races in deployed applications may drastically increase due to the rise of multicore hardware and the immaturity of current race detectors. LOOM is a live-workaround system designed to quickly and safely bypass application races at runtime. LOOM provides a flexible and safe language for developers to write execution filters that explicitly synchronize code. It then uses an evacuation algorithm to safely install the filters to live applications to avoid races....

10.5555/1924943.1924953 article EN Operating Systems Design and Implementation 2010-10-04

Bidl

Ji Qi, Xusheng Chen, Yunpeng Jiang, Jianyu Jiang, Tianxiang Shen, and 6 more

A permissioned blockchain framework typically runs an efficient Byzantine consensus protocol and is attractive to deploy fast trading applications among a large number of mutually untrusted participants (e.g., companies). Unfortunately, all existing permissioned blockchain frameworks adopt sequential workflows for invoking the consensus protocol and executing applications' transactions, making the performance of these applications much lower than deploying them in traditional systems (e.g., in-datacenter stock exchange).

10.1145/3477132.3483574 article EN 2021-10-19

Evaluating and improving neural program-smoothing-based fuzzing

Mingyuan Wu, Ling Jiang, Jiahong Xiang, Yuqun Zhang, Guowei Yang, and 5 more

Fuzzing nowadays has been commonly modeled as an optimization problem, e.g., maximizing code coverage under a given time budget via typical search-based solutions such as evolutionary algorithms. However, such solutions are widely argued to cause inefficient computing resource usage, i.e., inefficient mutations. To address this issue, two neural program-smoothing-based fuzzers, Neuzz and MTFuzz, have been recently proposed to approximate program branching behaviors via neural network models, which input byte sequences of a seed and output...

10.1145/3510003.3510089 article EN Proceedings of the 44th International Conference on Software Engineering 2022-05-21

Sound and precise analysis of parallel programs through schedule specialization

Jingyue Wu, Yang Tang, Gang Hu, Heming Cui, Junfeng Yang

Parallel programs are known to be difficult to analyze. A key reason is that they typically have an enormous number of execution interleavings, or schedules. Static analysis over all schedules requires over-approximations, resulting in poor precision; dynamic analysis rarely covers more than a tiny fraction of all schedules. We propose an approach called schedule specialization to analyze a parallel program over only a small set of schedules for precision, and then enforce these schedules at runtime for soundness of the static analysis results. We build a schedule specialization framework for C/C++...

10.1145/2254064.2254090 article EN 2012-06-11

Making parallel programs reliable with stable multithreading

Junfeng Yang, Heming Cui, Jingyue Wu, Yang Tang, Gang Hu

Stable multithreading dramatically simplifies the interleaving behaviors of parallel programs, offering new hope for making programming easier.

10.1145/2500875 article EN Communications of the ACM 2014-02-26

EffiBench: Benchmarking the Efficiency of Automatically Generated Code

Dong Huang, Jie M. Zhang, Yuhao Qing, Heming Cui

Code generation models have increasingly become integral to aiding software development, offering assistance in tasks such as code completion, debugging, and code translation. Although current research has thoroughly examined the correctness of code produced by code generation models, a vital aspect, i.e., the efficiency of the generated code, has often been neglected. This paper presents EffiBench, a benchmark with 1,000 efficiency-critical coding problems for assessing the efficiency of code generated by code generation models. EffiBench contains a diverse set of LeetCode coding problems. Each...

10.48550/arxiv.2402.02037 preprint EN arXiv (Cornell University) 2024-02-03

Achieving low tail-latency and high scalability for serializable transactions in edge computing

Xusheng Chen, Haoze Song, Jianyu Jiang, Chaoyi Ruan, Cheng Li, and 4 more

A distributed database utilizing the wide-spread edge computing servers to provide low-latency data access with the serializability guarantee is highly desirable for emerging edge computing applications. In an edge database, nodes are divided into regions, and a transaction can be categorized as intra-region (IRT) or cross-region (CRT) based on whether it accesses data in different regions. In addition to serializability, we insist that a practical edge database should provide low tail latency for both IRTs and CRTs, and such low latency must be scalable to a large number of...

10.1145/3447786.3456238 article EN 2021-04-21

OWL: Understanding and Detecting Concurrency Attacks

Shixiong Zhao, Rui Gu, Haoran Qiu, Tsz On Li, Yuexuan Wang, and 2 more

Just like bugs in single-threaded programs can lead to vulnerabilities, bugs in multithreaded programs can also lead to concurrency attacks. We studied 31 real-world concurrency attacks, including privilege escalations, hijacking code executions, and bypassing security checks. We found that compared to concurrency bugs' traditional consequences (e.g., program crashes), concurrency attacks' consequences are often implicit, extremely hard to be observed and diagnosed by program developers. Moreover, in addition to bug-inducing inputs, extra subtle inputs are often needed to trigger the attacks. These features make...

10.1109/dsn.2018.00033 article EN 2018-06-01
