NFDI4DS | UHH-SEMS - Publication Details

Wei Chen

ORCID: 0000-0002-9402-6249

About

Contact & Profiles

A5100344511

Research Areas

Cloud Computing and Resource Management
Distributed and Parallel Computing Systems
IoT and Edge/Fog Computing
Parallel Computing and Optimization Techniques
Software System Performance and Reliability
Caching and Content Delivery
Advanced Data Storage Technologies
Graph Theory and Algorithms
Software-Defined Networks and 5G
Network Security and Intrusion Detection
Advanced Database Systems and Queries
Big Data and Business Intelligence
Service-Oriented Architecture and Web Services
Technology and Security Systems
Data Management and Algorithms
Data Stream Mining Techniques
Artificial Intelligence in Healthcare
Advanced Computational Techniques and Applications
Metallurgy and Material Science

Alibaba Group (United States)
2022

Guangzhou Vocational College of Science and Technology
2022

University of Colorado Colorado Springs
2017-2019

Wayne State University
2019

Shanghai University of Electric Power
2018

University of Science and Technology Beijing
2017

Chinese Academy of Sciences
2014-2015

Beijing Institute of Technology
2015

South China University of Technology
2013

Institute of Software
2012

MORM: A Multi-objective Optimized Replication Management strategy for cloud storage cluster

OPENALEX - Publications

Saiqin Long Yuelong Zhao Wei Chen

10.1016/j.sysarc.2013.11.012 article EN Journal of Systems Architecture 2013-12-12

Map-Balance-Reduce: An improved parallel programming model for load balancing of MapReduce

Jianjiang Li Yajun Liu Jian Pan Peng Zhang Wei Chen and 1 more

10.1016/j.future.2017.03.013 article EN Future Generation Computer Systems 2017-03-18

A Profit-Aware Virtual Machine Deployment Optimization Framework for Cloud Platform Providers

Wei Chen Xiaoqiang Qiao Jun Wei Tao Huang

As a rising application paradigm, cloud computing enables the resources to be virtualized and shared among applications. In a typical scenario, customers, Service Providers (SP), and Platform Providers (PP) are independent participants, and they have their own objectives with different revenues and costs. From the PPs' viewpoints, much research work reduced costs by optimizing VM placement and deciding when and how to perform migrations. However, some ignored the fact that the balanced use of multi-dimensional resources can affect the overall resource...

10.1109/cloud.2012.60 article EN 2012-06-01

A three-phase energy-saving strategy for cloud storage systems

Saiqin Long Yuelong Zhao Wei Chen

10.1016/j.jss.2013.08.018 article EN Journal of Systems and Software 2013-08-27

Dependency-Aware Network Adaptive Scheduling of Data-Intensive Parallel Jobs

Shaoqi Wang Wei Chen Xiaobo Zhou Liqiang Zhang Yin Wang

Datacenter clusters often run data-intensive jobs in parallel for improving resource utilization and cost efficiency. The performance of data-intensive jobs is constrained by the cluster's hard-to-scale network bisection bandwidth. Various solutions have been proposed to address the issue; however, most of them do not consider inter-job data dependencies and schedule jobs independently from one another. In this work, we find that aggregating and co-locating the tasks of dependent jobs offer an extra opportunity for locality improvement and can help...

10.1109/tpds.2018.2866993 article EN publisher-specific-oa IEEE Transactions on Parallel and Distributed Systems 2018-08-27

Aggressive Synchronization with Partial Processing for Iterative ML Jobs on Clusters

Shaoqi Wang Wei Chen Aidi Pi Xiaobo Zhou

Executing distributed machine learning (ML) jobs on Spark follows the Bulk Synchronous Parallel (BSP) model, where parallel tasks execute the same iteration at the same time and the generated updates must be synchronized on parameters when all tasks are finished. However, parallel tasks rarely have the same execution time due to sparse data, so that the synchronization has to wait for tasks finished late. Moreover, running on heterogeneous clusters makes it even worse because of stragglers, and the synchronization is significantly delayed by the slowest task.

10.1145/3274808.3274828 article EN 2018-11-26

Preemptive and Low Latency Datacenter Scheduling via Lightweight Containers

Wei Chen Xiaobo Zhou Jia Rao

Datacenters are evolving to host heterogeneous workloads on shared clusters to reduce the operational cost and achieve higher resource utilization. However, it is challenging to schedule heterogeneous workloads with diverse resource requirements and QoS constraints. On the one hand, latency-critical jobs need to be scheduled as soon as they are submitted to avoid any queuing delays. On the other hand, best-effort long jobs should be allowed to occupy the cluster when there are idle resources to improve cluster utilization. The challenge lies in how to minimize the queuing delays of short jobs while maximizing cluster utilization. In this...

10.1109/tpds.2019.2957754 article EN publisher-specific-oa IEEE Transactions on Parallel and Distributed Systems 2019-12-05

Characterizing Scheduling Delay for Low-Latency Data Analytics Workloads

Wei Chen Aidi Pi Shaoqi Wang Xiaobo Zhou

Data analytics workloads are shifting to shorter task execution time, higher degree of parallelism, and on faster hardware. As a result, job scheduling is becoming a bottleneck, which needs to offer extreme low-latency, massive throughput, and high scalability. However, few efforts have been focused on systematically understanding the scheduling delay. In this paper, we propose a method and develop a tool, SD-checker, that decomposes the scheduling delay into multiple components and characterizes each component by extensive experiments. SDchecker...

10.1109/ipdps.2018.00072 article EN 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2018-05-01

vNetTracer: Efficient and Programmable Packet Tracing in Virtualized Networks

Kun Suo Yong Zhao Wei Chen Jia Rao

As the scale of cloud systems continues to grow, virtualized networks that provide connectivity between services within and across data centers are becoming increasingly important to the performance and reliability of the cloud. Despite many advantages, including fast deployment, ease of management, and programmability, virtualized networks require additional layers of abstraction and complicate the monitoring and diagnosis of performance issues compared to traditional networks on physical hardware. Virtualized networks usually connect components in multiple protection domains, such as...

10.1109/icdcs.2018.00026 article EN 2018-07-01

Pufferfish

Wei Chen Aidi Pi Shaoqi Wang Xiaobo Zhou

Data-intensive applications often suffer from significant memory pressure, resulting in excessive garbage collection (GC) and out-of-memory (OOM) errors, harming system performance and reliability. In this paper, we demonstrate how lightweight virtualization via OS containers opens up opportunities to address memory pressure and realize memory elasticity: 1) tasks running in a container can be set to a large heap size to avoid OutOfMemory errors, and 2) tasks that are under memory pressure and incur swapping activities can be temporarily "suspended" by depriving...

10.1145/3357223.3362730 article EN 2019-11-11

Fine-grained modeling and optimization for intelligent resource management in big data processing

Chenghao Lyu Qi Fan Fei Song Arnab Sinha Yanlei Diao and 6 more

Big data processing at the production scale presents a highly complex environment for resource optimization (RO), a problem crucial for meeting the performance goals and budgetary constraints of analytical users. The RO problem is challenging because it involves a set of decisions (the partition count, the placement of parallel instances on machines, and resource allocation to each instance), requires multi-objective optimization (MOO), and is compounded by the complexity of big data systems while having to meet stringent time constraints for scheduling. This paper presents a MaxCompute-based...

10.14778/3551793.3551855 article EN Proceedings of the VLDB Endowment 2022-07-01

Improving Utilization and Parallelism of Hadoop Cluster by Elastic Containers

Yinggen Xu Wei Chen Shaoqi Wang Xiaobo Zhou Changjun Jiang

Modern datacenter schedulers apply a static policy to partition resources among different tasks. The amount of allocated resource won't get changed during a task's lifetime. However, we found that resource usage during runtime demonstrates high dynamics and it only reaches full usage at a few moments. Therefore, the static allocation doesn't exploit the dynamic nature of resource usage, leading to low system utilization. To address this hard problem, a recently proposed task-consolidation approach packs as many tasks as possible on the same node based...

10.1109/infocom.2018.8486400 article EN IEEE INFOCOM 2018 - IEEE Conference on Computer Communications 2018-04-01

PETS: Bottleneck-Aware Spark Tuning with Parameter Ensembles

Tiago Barreto Goes Perez Wei Chen Raymond Ji Liu Liu Xiaobo Zhou

Spark tuning with its dozens of parameters for performance improvement is both a challenge and a time-consuming effort. Current techniques rely on trial-and-error or best guess, utilizing expert knowledge that very few possess. Previous works are not compatible and also ignore the underlying problem of resource bottlenecks that cause performance issues, a potential ally if awareness is leveraged in directing tuning to be more effective. We propose develop PETS, new method allows associated at same time, using bottleneck adjust...

10.1109/icccn.2018.8487324 article EN 2018-07-01

Otterman: A Novel Approach of Spark Auto-tuning by a Hybrid Strategy

Haizhou Du Ping Han Wei Chen Yi Wang Chenlu Zhang

Spark has become a very attractive platform for big data analytics in recent years due to its unique advantages such as parallelism, fault tolerance, and complexity associated with clusters setup. On the Spark platform, users can adjust the parameter configurations according to different job requirements and specific applications to optimize performance. This leads to a problem that we can't ignore: Spark already has more than 180 parameters, and the huge combination of parameters means that users can't rely on manual tuning to grasp the impact of all parameters on performance. In...

10.1109/icsai.2018.8599304 article EN 2018-11-01

Profiling Distributed Systems in Lightweight Virtualized Environments with Logs and Resource Metrics

Aidi Pi Wei Chen Xiaobo Zhou

Understanding and troubleshooting distributed systems in the cloud is considered a very difficult problem because the execution of a single user request is distributed to multiple machines. Further, the multi-tenancy nature of cloud environments introduces interference that causes performance issues. Most existing tools either focus on log analysis or intrusive tracing methods, leaving resource usage monitoring unexplored.

10.1145/3220192.3220197 article EN 2018-06-07

Addressing Skewness in Iterative ML Jobs with Parameter Partition

Shaoqi Wang Wei Chen Xiaobo Zhou Sang–Yoon Chang Mike Ji

Computational skewness is a significant challenge in multi-tenant data-parallel clusters that introduce dynamic heterogeneity of machine capacity in distributed data processing. Previous efforts to address skewness mostly focus on batch jobs based on the assumption that processing time is linearly dependent on the size of partitioned data. However, they are ill-suited for iterative machine learning (ML) jobs, which (1) exhibit a non-linear relationship between the size of parameters and the processing time within each iteration, and (2) show an explicit binding between input...

10.1109/infocom.2019.8737583 article EN IEEE INFOCOM 2019 - IEEE Conference on Computer Communications 2019-04-01

A Two-Level Virtual Machine Self-Reconfiguration Mechanism for the Cloud Computing Platforms

Wei Chen Xiaoqiang Qiao Jun Wei Tao Huang

Cloud computing is a new computing model and technology that leverages the efficient pooling of on-demand, self-managed virtual infrastructure. Virtualization technology packages applications in the form of Virtual Machines (VM) and provides significant benefits by reconfiguring VMs dynamically. VM reconfiguration is hard and complicated, and existing work addressed the problem with diverse objectives by answering the questions of when to reconfigure, which VMs should be reconfigured, and where to host the VMs. However, we found that runtime affects the total costs...

10.1109/uic-atc.2012.39 article EN 2012-09-01

Addressing Memory Pressure in Data-intensive Parallel Programs via Container Based Virtualization

Wei Chen Jia Rao Xiaobo Zhou

Out-of-memory (OOM) errors and excessive garbage collection (GC) activities are common issues in data-intensive parallel programs, which cause not only poor performance but also execution failures. A recent study [1] proposed a new programming model to address the memory pressure in data-parallel programs. The iTask model proactively reclaims memory to avoid OOM errors and reduce GC time. Although effective, it requires extensive changes to the program. In this paper, we show that lightweight virtualization, such as OS...

10.1109/icac.2017.28 article EN 2017-07-01

Policy Based Power Management in Cloud Environment with Intel Intelligent Power Node Manager

Wei Chen Fengqian Gao Yuan Lu

Open Source Private Cloud (OSPC) is a full-stack private cloud solution based on Open Stack to help users enable and manage a private cloud environment. Intel Intelligent Power Node Manager is a platform resident technology with power and thermal policies. In this paper, we introduce how to manage power in cloud computing effectively by integrating Intel Intelligent Power Node Manager with OSPC and define resolution as management policy. Live migration supported by Xen is implemented as a new policy to balance the load. In our experiment, Prime95 is used as a benchmark to verify the effectiveness of the policy. The result shows that...

10.1109/edocw.2012.18 article EN 2012-09-01

A Virtual Machine Placement and Reconfiguration Framework for Cloud Computing Platforms

Wei Chen Xiaoqiang Qiao Jun Wei Hua Zhong Tao Huang

As a rising application paradigm and technology, cloud computing can leverage the efficient pooling of on-demand, self-managed virtual infrastructure. How to maximize resource utilization and how to reduce the cost of configuration are essential issues in cloud computing. In this paper, the authors propose a framework to achieve these objectives by optimizing VM placement and deciding when to perform reconfigurations. The vector arithmetic model objective balancing multiple an optimization method for static placement. Then...

10.4018/ijaras.2014040101 article EN International Journal of Adaptive Resilient and Autonomic Systems 2014-04-01

Demo/poster abstract: Efficient and flexible packet tracing for virtualized networks using eBPF

Kun Suo Yong Zhao Wei Chen Jia Rao

As the scale of cloud systems continues to grow, virtualized networks are becoming increasingly important to the performance and reliability of the cloud. Despite many advantages, virtualized networks introduce additional layers of abstraction and are more difficult to monitor and diagnose performance issues compared to traditional networks. Furthermore, it is challenging to reason about dynamic networks. Therefore, there is a great need for fine-grained, user customizable, and reconfigurable network tracing. To address the above challenges, we propose vNetTracer, an efficient...

10.1109/infcomw.2018.8406849 article EN IEEE INFOCOM 2018 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS) 2018-04-01

Design of a flow integrating information system based on cloud computing

Renjie Li Jian Jiang Wei Chen

Vast amounts of data are collected into the flow measurement system for the primary purpose of trade settlement, and the fairness and justice of the entire settlement depends entirely on the integrity of the flow integrating system. Therefore, the role of the system is very important. Measurement equipment of the traditional system is expensive, but its information release ability is poor and its operation and maintenance is complex. To solve this problem, this paper designs a flow integrating information system based on cloud computing technology. The system uses a data centre and Hadoop to manage massive data, mathematical models to compute application...

10.2991/icismme-15.2015.301 article EN cc-by-nc Advances in Intelligent Systems Research 2015-01-01

Lero: A Learning-to-Rank Query Optimizer

Rong Zhu Wei Chen Bolin Ding Xingguang Chen Andreas Pfadler and 2 more

A recent line of works apply machine learning techniques to assist or rebuild cost-based query optimizers in DBMS. While exhibiting superiority in some benchmarks, their deficiencies, e.g., unstable performance, high training cost, and slow model updating, stem from the inherent hardness of predicting the cost or latency of execution plans using machine learning models. In this paper, we introduce a learning-to-rank query optimizer, called Lero, which builds on top of a native query optimizer and continuously learns to improve query optimization...

10.48550/arxiv.2302.06873 preprint EN cc-by arXiv (Cornell University) 2023-01-01

mBalloon

Wei Chen Aidi Pi Jia Rao Xiaobo Zhou

Big Data processing often suffers from significant memory pressure, resulting in excessive garbage collection (GC) and out-of-memory (OOM) errors, harming system performance and reliability. Therefore, users tend to give an excessive heap size to applications to avoid job failure, causing low cluster utilization.

10.1145/3127479.3132565 article EN 2017-09-24
