NFDI4DS | UHH-SEMS - Publication Details

Dazhao Cheng

ORCID: 0000-0003-2869-7623

About

Contact & Profiles

A5063911669

Research Areas

Cloud Computing and Resource Management
IoT and Edge/Fog Computing
Caching and Content Delivery
Advanced Neural Network Applications
Parallel Computing and Optimization Techniques
Privacy-Preserving Technologies in Data
Advanced Data Storage Technologies
Software-Defined Networks and 5G
Topic Modeling
Stochastic Gradient Optimization Techniques
Blockchain Technology Applications and Security
Software System Performance and Reliability
Graph Theory and Algorithms
Brain Tumor Detection and Classification
Nuclear Materials and Properties
Cryptography and Data Security
Generative Adversarial Networks and Image Synthesis
Distributed and Parallel Computing Systems
Distributed systems and fault tolerance
Data Stream Mining Techniques
Natural Language Processing Techniques
Nuclear reactor physics and engineering
Privacy, Security, and Data Protection
Green IT and Sustainability
Advanced Graph Neural Networks

Wuhan University
2022-2025

Northwestern Polytechnical University
2024

University of North Carolina at Charlotte
2016-2021

Weatherford College
2021

Flint Institute Of Arts
2021

University of Colorado Colorado Springs
2013-2016

Improving Performance of Heterogeneous MapReduce Clusters with Adaptive Task Tuning

OPENALEX - Publications

Dazhao Cheng Jia Rao Yanfei Guo Changjun Jiang Xiaobo Zhou

Datacenter-scale clusters are evolving toward heterogeneous hardware architectures due to continuous server replacement. Meanwhile, datacenters are commonly shared by many users for quite different uses, and they often exhibit significant performance heterogeneity due to multi-tenant interferences. The deployment of MapReduce on such clusters presents challenges in achieving good application performance compared to in-house dedicated clusters. As most MapReduce implementations were originally designed for homogeneous environments, heterogeneity can cause...

10.1109/tpds.2016.2594765 article EN publisher-specific-oa IEEE Transactions on Parallel and Distributed Systems 2016-07-27

iShuffle: Improving Hadoop Performance with Shuffle-on-Write

OPENALEX - Publications

Yanfei Guo Jia Rao Dazhao Cheng Xiaobo Zhou

Hadoop is a popular implementation of the MapReduce framework for running data-intensive jobs on clusters of commodity servers. Shuffle, the all-to-all input data fetching phase between map and reduce, can significantly affect job performance. However, shuffle and reduce are coupled together in Hadoop, and shuffle can only be performed by reduce tasks. This leaves the potential parallelism between multiple waves of map and reduce unexploited and resource wastage...

10.1109/tpds.2016.2587645 article EN publisher-specific-oa IEEE Transactions on Parallel and Distributed Systems 2016-07-07

Improving MapReduce performance in heterogeneous environments with adaptive task tuning

OPENALEX - Publications

Dazhao Cheng Jia Rao Yanfei Guo Xiaobo Zhou

The deployment of MapReduce in datacenters and clouds presents several challenges in achieving good job performance. Compared to in-house dedicated clusters, datacenters and clouds often exhibit significant hardware and performance heterogeneity due to continuous server replacement and multi-tenant interferences. As most MapReduce implementations assume homogeneous clusters, heterogeneity can cause load imbalance in task execution, leading to poor performance and low cluster utilization. Despite existing optimizations on task scheduling and load balancing, MapReduce still performs poorly...

10.1145/2663165.2666089 article EN 2014-01-01

Resource and Deadline-Aware Job Scheduling in Dynamic Hadoop Clusters

OPENALEX - Publications

Dazhao Cheng Jia Rao Changjun Jiang Xiaobo Zhou

As Hadoop is becoming increasingly popular in large-scale data analysis, there is a growing need for providing predictable services to users who have strict requirements on job completion times. While earliest deadline first (EDF)-like scheduling algorithms are effective at guaranteeing deadlines in real-time systems, they are not effective in a dynamic environment, i.e., a cluster with dynamically available resources. A growing number of clusters are deployed on hybrid systems, e.g., infrastructure powered by a mix of traditional and renewable...

10.1109/ipdps.2015.36 article EN 2015-05-01

Energy Efficiency Aware Task Assignment with DVFS in Heterogeneous Hadoop Clusters

OPENALEX - Publications

Dazhao Cheng Xiaobo Zhou Palden Lama Mike Ji Changjun Jiang

While Hadoop ecosystems have become increasingly important for practitioners of large-scale data analysis, they also incur tremendous energy cost. This trend is driving up the need for designing energy-efficient Hadoop clusters in order to reduce the operational costs and the carbon emission associated with their energy consumption. However, despite extensive studies of the problem, existing approaches for energy efficiency have not fully considered the heterogeneity of both workload and machine hardware found in production environments. In this paper, we...

10.1109/tpds.2017.2745571 article EN publisher-specific-oa IEEE Transactions on Parallel and Distributed Systems 2017-08-28

Cross-Platform Resource Scheduling for Spark and MapReduce on YARN

OPENALEX - Publications

Dazhao Cheng Xiaobo Zhou Palden Lama Jun Wu Changjun Jiang

While MapReduce is inherently designed for batch and high-throughput processing workloads, there is an increasing demand for non-batch processes on big data, e.g., interactive jobs, real-time queries, and stream computations. The emerging Apache Spark fills this gap; it can run on an established Hadoop cluster and take advantage of the existing HDFS. As a result, the deployment model of Spark-on-YARN is widely applied by many industry leaders. However, we identify three key challenges to deploying Spark on YARN, inflexible...

10.1109/tc.2017.2669964 article EN publisher-specific-oa IEEE Transactions on Computers 2017-02-15

DNN Surgery: Accelerating DNN Inference on the Edge Through Layer Partitioning

OPENALEX - Publications

Huanghuang Liang Qianlong Sang Chuang Hu Dazhao Cheng Xiaobo Zhou and 3 more

Recent advances in deep neural networks (DNNs) have substantially improved the accuracy and speed of various intelligent applications. Nevertheless, one obstacle is that DNN inference imposes a heavy computation burden on end devices, but offloading inference tasks to the cloud causes a large volume of data transmission. Motivated by the fact that the size of some intermediate layers is significantly smaller than that of the raw input data, we designed DNN surgery, which allows a partitioned DNN to be processed at both the edge and the cloud while limiting the data transmission. The challenge...

10.1109/tcc.2023.3258982 article EN IEEE Transactions on Cloud Computing 2023-03-20

Phase field simulation of pore morphological reconstruction and trailing bubbles in UO2 during pore migration

OPENALEX - Publications

Caiyan Liu Yunpeng Zhang Dazhao Cheng Yu Kang Changqing Teng and 2 more

10.1016/j.nucengdes.2024.113321 article EN Nuclear Engineering and Design 2024-05-30

Adaptive scheduling of parallel jobs in spark streaming

OPENALEX - Publications

Dazhao Cheng Yuan Chen Xiaobo Zhou Daniel Gmach Dejan Milojičić

Streaming data analytics has become increasingly vital in many applications such as dynamic content delivery (e.g., advertisements), Twitter sentiment analysis, and security event processing (e.g., intrusion detection systems, spam filters). Emerging stream processing systems, such as Spark Streaming, treat the continuous stream as a series of micro-batches and continuously process these micro-batch jobs. Such micro-batch based stream processing provides several advantages over traditional stream processing systems, which process streaming data one record at a time, including fast recovery from failures,...

10.1109/infocom.2017.8057206 article EN IEEE INFOCOM 2017 - IEEE Conference on Computer Communications 2017-05-01

Adaptive Scheduling Parallel Jobs with Dynamic Batching in Spark Streaming

OPENALEX - Publications

Dazhao Cheng Xiaobo Zhou Yu Wang Changjun Jiang

Today enterprises have massive stream data that must be processed in real time due to the data explosion in recent years. Spark Streaming, as an emerging system, is developed to process stream data analytics by using a micro-batch approach. The unified programming model of Spark Streaming leads to some unique benefits over other traditional streaming systems, such as fast recovery from failures, better load balancing and resource usage. It treats the continuous stream as a series of micro-batches and continuously processes these micro-batch jobs. However, efficient...

10.1109/tpds.2018.2846234 article EN IEEE Transactions on Parallel and Distributed Systems 2018-06-12

Tackling Cold Start of Serverless Applications by Efficient and Adaptive Container Runtime Reusing

OPENALEX - Publications

Kun Suo Junggab Son Dazhao Cheng Wei Chen Sabur Baidya

During the past few years, serverless computing has changed the paradigm of application development and deployment in the cloud and at the edge due to its unique advantages, including easy administration, automatic scaling, built-in fault tolerance, etc. Nevertheless, serverless computing is also facing challenges such as long latency due to cold start. In this paper, we present an in-depth performance analysis of cold start in the serverless framework and propose HotC, a container-based runtime management framework that leverages lightweight containers to mitigate the cold start and improve...

10.1109/cluster48925.2021.00018 article EN 2021-09-01

Phase field simulation of columnar grain formation induced by pore migration in UO2

OPENALEX - Publications

Caiyan Liu Yunpeng Zhang Dazhao Cheng Lei Shao Changqing Teng and 2 more

10.1016/j.jeurceramsoc.2025.117264 article EN Journal of the European Ceramic Society 2025-02-01

Spread+: Scalable Model Aggregation in Federated Learning with Non-IID Data

OPENALEX - Publications

Huanghuang Liang Xin Yang Xiaoming Han B. X. Liu Chuang Hu and 3 more

10.1109/tpds.2025.3539738 article EN IEEE Transactions on Parallel and Distributed Systems 2025-01-01

Harnessing Inter-GPU Shared Memory for Seamless MoE Communication-Computation Fusion

OPENALEX - Publications

H. Wang Yaqi Xia Donglin Yang Xiaobo Zhou Dazhao Cheng

10.1145/3710848.3710868 article EN 2025-02-28

Streamlining Data Transfer in Collaborative SLAM Through Bandwidth-Aware Map Distillation

OPENALEX - Publications

Rui Ge Huanghuang Liang Zheng Gong Chuang Hu Xiaobo Zhou and 1 more

10.1109/tmc.2025.3549367 article EN IEEE Transactions on Mobile Computing 2025-01-01

Priva: privacy reversible and intelligible video analytics system through diffusion restoration models

OPENALEX - Publications

Huixian Feng Chuang Hu Dazhao Cheng

10.1117/1.jei.34.2.023022 article EN Journal of Electronic Imaging 2025-03-19

GVA: general content-aware feature map reusing structure for edge-side video analytics

OPENALEX - Publications

Yuqing Zhang Chuang Hu Dazhao Cheng

10.1117/1.jei.34.2.023010 article EN Journal of Electronic Imaging 2025-03-12

Understanding the Challenges Students Face in Non-English Programming Environments Due to the Programming Language Transition: A Case Study of Keywords in the Chinese Version of Scratch

OPENALEX - Publications

Siyu Wang Janice Jianing Huanghuang Liang Chuang Hu Yujun Zhu and 3 more

10.1145/3706598.3713446 article EN 2025-04-24

Heterogeneity-Aware Workload Placement and Migration in Distributed Sustainable Datacenters

OPENALEX - Publications

Dazhao Cheng Changjun Jiang Xiaobo Zhou

While major cloud service operators have taken various initiatives to operate their sustainable datacenters with green energy, it is challenging to effectively utilize the green energy since its generation depends on dynamic natural conditions. Fortunately, the geographical distribution of datacenters provides an opportunity for optimizing the system performance by distributing cloud workloads. In this paper, we propose a holistic heterogeneity-aware cloud workload placement and migration approach, sCloud, that aims to maximize good...

10.1109/ipdps.2014.41 article EN 2014-05-01

Deadline-Aware MapReduce Job Scheduling with Dynamic Resource Availability

OPENALEX - Publications

Dazhao Cheng Xiaobo Zhou Yinggen Xu Liu Liu Changjun Jiang

As MapReduce is becoming ubiquitous in large-scale data analysis, many recent studies have shown that the performance of MapReduce could be improved by different job scheduling approaches, e.g., Fair Scheduler and Capacity Scheduler. However, most existing MapReduce job schedulers focus on the scenario in which the cluster is stable and pay little attention to clusters with dynamic resource availability. In fact, cluster resources may fluctuate, as there is a growing number of Hadoop clusters deployed on hybrid systems, e.g., infrastructure powered by a mix of traditional...

10.1109/tpds.2018.2873373 article EN IEEE Transactions on Parallel and Distributed Systems 2018-10-01

Redundancy-Free High-Performance Dynamic GNN Training with Hierarchical Pipeline Parallelism

OPENALEX - Publications

Yaqi Xia Zheng Zhang H. Wang Donglin Yang Xiaobo Zhou and 1 more

Temporal Graph Neural Networks (TGNNs) extend the success of Graph Neural Networks to dynamic graphs. Distributed TGNN training requires efficiently tackling temporal dependency, which often leads to excessive cross-device communication that generates significant redundant data. However, existing systems are unable to remove the redundancy in data reuse and transfer, and suffer from severe communication overhead in a distributed setting. This paper presents Sven, an algorithm and system co-designed TGNN training library for end-to-end performance...

10.1145/3588195.3592990 article EN 2023-08-07

SLO-Aware Function Placement for Serverless Workflows With Layer-Wise Memory Sharing

OPENALEX - Publications

Dazhao Cheng Kai Yan X.T. Cai Yili Gong Chuang Hu

Function-as-a-Service (FaaS) is a promising cloud computing model known for its scalability and elasticity. In various application domains, FaaS workflows have been widely adopted to manage user requests and complete computational tasks efficiently. Motivated by the fact that function containers collaboratively use the image layer's memory, so that co-placing functions would leverage memory sharing to reduce the cluster memory footprint, this paper studies layer-wise memory sharing for serverless functions. We find overwhelming placing in...

10.1109/tpds.2024.3391858 article EN IEEE Transactions on Parallel and Distributed Systems 2024-04-22

Towards Energy Efficiency in Heterogeneous Hadoop Clusters by Adaptive Task Assignment

OPENALEX - Publications

Dazhao Cheng Palden Lama Changjun Jiang Xiaobo Zhou

The cost of powering servers, storage platforms and related cooling systems has become a major component of the operational costs in big data deployments. Hence, the design of energy-efficient Hadoop clusters has attracted significant research attention in recent years. However, existing studies do not consider the impact of the complex interplay between workload and hardware heterogeneity on energy efficiency. In this paper, we find that heterogeneity-oblivious task assignment approaches are detrimental to both...

10.1109/icdcs.2015.44 article EN 2015-06-01
