- Cloud Computing and Resource Management
- IoT and Edge/Fog Computing
- Distributed and Parallel Computing Systems
- Advanced Malware Detection Techniques
- Advanced Data Storage Technologies
- Distributed Systems and Fault Tolerance
- Software System Performance and Reliability
- Software Testing and Debugging Techniques
- Graph Theory and Algorithms
- Network Security and Intrusion Detection
- Software-Defined Networks and 5G
- Caching and Content Delivery
- Parallel Computing and Optimization Techniques
- Digital and Cyber Forensics
- Advanced Queuing Theory Analysis
- Software Engineering Research
- Security and Verification in Computing
- Age of Information Optimization
- Advanced Database Systems and Queries
- Network Traffic and Congestion Control
- Software Reliability and Analysis Research
- Data Stream Mining Techniques
- Information and Cyber Security
- Cloud Data Security Solutions
- Big Data Technologies and Applications
China University of Petroleum, Beijing
2022-2024
Kelun Group (China)
2024
Communication University of Zhejiang
2023
Amazon (Germany)
2021
China University of Petroleum, East China
2019-2020
Tsinghua University
2020
Institute for Infocomm Research
2018
Agency for Science, Technology and Research
2018
IBM Research - Thomas J. Watson Research Center
2004-2017
IBM Research - Almaden
2017
The scalability of modern data centers has become a practical concern and has attracted significant attention in recent years. In contrast to existing solutions that require changes to the network architecture and routing protocols, this paper proposes using traffic-aware virtual machine (VM) placement to improve scalability. By optimizing the placement of VMs on host machines, traffic patterns among VMs can be better aligned with the communication distance between them, e.g., VMs with large mutual bandwidth usage are assigned to host machines...
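A minimal sketch of the traffic-aware placement idea, assuming a simple greedy heuristic (not necessarily the paper's exact algorithm): co-locate the VM pairs with the highest mutual traffic on hosts that are close in the topology. All names, the distance model, and the slot limit are illustrative assumptions.

```python
def place_vms(traffic, hosts, host_distance, slots_per_host=2):
    """traffic: dict[(vm_a, vm_b)] -> bytes/s; hosts: list of host ids;
    host_distance: dict[(h1, h2)] -> hops. Returns a vm -> host mapping."""
    placement = {}
    load = {h: 0 for h in hosts}

    # Consider the heaviest-talking VM pairs first.
    for (a, b), _ in sorted(traffic.items(), key=lambda kv: -kv[1]):
        for vm in (a, b):
            if vm in placement:
                continue
            # Prefer the host closest to the partner's host (if already placed),
            # breaking ties by the least-loaded host.
            partner = b if vm == a else a
            anchor = placement.get(partner)
            candidates = [h for h in hosts if load[h] < slots_per_host]
            if anchor is not None:
                candidates.sort(key=lambda h: (host_distance[(anchor, h)], load[h]))
            else:
                candidates.sort(key=lambda h: load[h])
            placement[vm] = candidates[0]
            load[candidates[0]] += 1
    return placement

if __name__ == "__main__":
    hosts = ["h1", "h2", "h3"]
    dist = {(x, y): (0 if x == y else 1) for x in hosts for y in hosts}
    traffic = {("vm1", "vm2"): 900, ("vm3", "vm4"): 500, ("vm1", "vm3"): 10}
    print(place_vms(traffic, hosts, dist))
```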
Resource provisioning in compute clouds often requires an estimate of the capacity needs of Virtual Machines (VMs). The estimated VM size is the basis for allocating resources commensurate with workload demand. In contrast to the traditional practice of estimating VM sizes individually, we propose a joint-VM sizing approach in which multiple VMs are consolidated and provisioned together, based on their aggregate capacity needs. This new approach exploits statistical multiplexing among the workload patterns of VMs, i.e., the peaks and valleys of one pattern do not...
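A toy illustration of why joint sizing can save capacity (an assumed formulation, not the paper's exact estimator): compare the sum of per-VM 95th-percentile demands against the 95th percentile of the aggregate demand. When peaks do not coincide, the joint estimate is smaller.

```python
import random

def percentile(samples, q):
    s = sorted(samples)
    idx = min(len(s) - 1, int(q * len(s)))
    return s[idx]

random.seed(0)
T = 1000
# Two synthetic VM demand traces whose peaks rarely overlap (illustrative data).
vm1 = [random.gauss(30, 5) + (40 if t % 100 < 10 else 0) for t in range(T)]
vm2 = [random.gauss(30, 5) + (40 if 50 <= t % 100 < 60 else 0) for t in range(T)]

individual = percentile(vm1, 0.95) + percentile(vm2, 0.95)
joint = percentile([a + b for a, b in zip(vm1, vm2)], 0.95)

print(f"sum of individual 95th percentiles: {individual:.1f}")
print(f"95th percentile of aggregate demand: {joint:.1f}")  # typically smaller
```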
Recent advances in virtualization technology have made it common practice to consolidate virtual machines (VMs) onto a smaller number of servers. An efficient consolidation scheme requires that VMs are packed tightly, yet receive resources commensurate with their demands. However, measurements from production data centers show that network bandwidth demands are dynamic, making it difficult to characterize them by a fixed value and apply traditional packing schemes. In this work, we formulate the VM consolidation problem as Stochastic Bin Packing...
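A sketch of one standard way to handle stochastic demands in bin packing (an effective-size heuristic under a Gaussian-demand assumption, not necessarily the algorithm in this paper): pack VMs so that each bin's mean-plus-z-sigma bandwidth stays within capacity.

```python
from math import sqrt

Z_95 = 1.645  # one-sided z-score for roughly 5% overflow probability

def effective_size(mu, sigma, z=Z_95):
    return mu + z * sigma

def first_fit_decreasing(vms, capacity, z=Z_95):
    """vms: list of (name, mean demand, std dev), assumed independent Gaussians.
    Keep each bin's mu_bin + z * sqrt(var_bin) within capacity."""
    bins = []  # each bin: {"vms": [...], "mu": float, "var": float}
    for name, mu, sigma in sorted(vms, key=lambda v: -effective_size(v[1], v[2], z)):
        for b in bins:
            if b["mu"] + mu + z * sqrt(b["var"] + sigma ** 2) <= capacity:
                b["vms"].append(name)
                b["mu"] += mu
                b["var"] += sigma ** 2
                break
        else:
            bins.append({"vms": [name], "mu": mu, "var": sigma ** 2})
    return [b["vms"] for b in bins]

if __name__ == "__main__":
    vms = [("vm1", 300, 60), ("vm2", 250, 40), ("vm3", 400, 100), ("vm4", 150, 30)]
    print(first_fit_decreasing(vms, capacity=1000))  # Mbps, illustrative numbers
```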
Spark has been increasingly adopted by industry in recent years for big data analysis, providing a fault-tolerant, scalable, and easy-to-use in-memory abstraction. Moreover, the community has been actively developing a rich ecosystem around Spark, making it even more attractive. However, there is not yet a Spark-specific benchmark in the existing literature to guide the development and cluster deployment of Spark to better fit the resource demands of user applications. In this paper, we present SparkBench, a Spark-specific benchmarking suite, which...
With the increase of energy consumption associated with IT infrastructures, energy management is becoming a priority in the design and operation of complex service-based systems. At the same time, service providers need to comply with Service Level Agreement (SLA) contracts, which determine revenues and penalties on the basis of the achieved performance level. This paper focuses on the resource allocation problem in multitier virtualized systems with the goal of maximizing SLA revenue while minimizing energy costs. The main novelty of our approach is to address—in...
The MapReduce/Hadoop framework has been widely used to process large-scale datasets on computing clusters. Scheduling map tasks with data locality in mind is crucial to the performance of MapReduce. Many works have been devoted to increasing data locality for better efficiency. However, to the best of our knowledge, the fundamental limits of MapReduce clusters with data locality, including the capacity region and theoretical bounds on delay performance, have not been well studied. In this paper, we address these problems from a stochastic network...
MapReduce job parameter tuning is a daunting and time-consuming task. The configuration space is huge; there are more than 70 parameters that impact job performance. It is also difficult for users to determine suitable values for the parameters without first having a good understanding of the application characteristics. Thus, it is a challenge to systematically explore the space and select a near-optimal configuration. Extant offline approaches are slow and inefficient, as they entail multiple test runs and significant human effort.
With the development of Service Oriented Architecture (SOA), organizations are able to compose complex applications from distributed services supported by third-party providers. Under this scenario, large data centers serve many customers sharing the available IT resources, which leads to efficient use of resources and a reduction in operating costs. Providers and their customers often negotiate utility-based Service Level Agreements (SLAs) that determine costs and penalties based on the achieved performance levels. Data centers employ an autonomic...
Worldwide interest in the delivery of computing and storage capacity as a service continues to grow at a rapid pace. The complexities of such cloud centers require advanced resource management solutions that are capable of dynamically adapting the platform while providing continuous performance guarantees. The goal of this paper is to devise allocation policies for virtualized environments that satisfy availability guarantees and minimize energy costs in very large cloud centers. We present a scalable distributed hierarchical...
MapReduce/Hadoop production clusters exhibit heavy-tailed characteristics in job processing times. These phenomena result from both the workload features and the adopted scheduling algorithms. Analytically understanding the delays under different schedulers for MapReduce can facilitate the design and deployment of large Hadoop clusters. The map and reduce tasks of a job have fundamental differences and a tight dependence between them, complicating the analysis. This also leads to an interesting starvation problem with the widely used Fair...
Schedulers are critical in enhancing the performance of MapReduce/Hadoop in the presence of multiple jobs with different characteristics and goals. Though current schedulers for Hadoop are quite successful, they still have room for improvement: map tasks (MapTasks) and reduce tasks (ReduceTasks) are not jointly optimized, albeit there is a strong dependence between them. This can cause job starvation and unfavorable data locality. In this paper, we design and implement a resource-aware scheduler for Hadoop. It couples the progresses...
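An illustrative sketch of the coupling idea (a simplification, not the full scheduler described above): grant ReduceTask slots to a job in proportion to its MapTask progress, so reducers neither start too early (idling slots while waiting for map output) nor too late (delaying the shuffle).

```python
def reduce_slots_to_grant(map_finished, map_total, reduce_total, reduce_running):
    """Target: the fraction of reducers running tracks the fraction of maps done."""
    map_progress = map_finished / max(map_total, 1)
    target = round(map_progress * reduce_total)
    return max(0, target - reduce_running)

# Example: a job with 100 maps and 10 reducers (illustrative numbers).
for finished in (0, 30, 60, 100):
    grant = reduce_slots_to_grant(finished, 100, 10, reduce_running=0)
    print(f"maps finished: {finished:3d} -> launch up to {grant} reducers")
```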
Scheduling map tasks to improve data locality is crucial to the performance of MapReduce. Many works have been devoted to increasing data locality for better efficiency. However, to the best of our knowledge, the fundamental limits of MapReduce computing clusters with data locality, including the capacity region and theoretical bounds on delay performance, have not been studied. In this paper, we address these problems from a stochastic network perspective. Our focus is to strike the right balance between data locality and load balancing to simultaneously...
Memory is a crucial resource for big data processing frameworks such as Spark and M3R, where memory is used both for computation and for caching intermediate storage data. Consequently, optimizing memory usage is key to extracting high performance. The extant approach is to statically split the memory based on workload profiling, which is unable to capture the varying characteristics and dynamic demands of workloads. Another factor that affects efficiency is the choice of placement and eviction policy. The LRU policy is oblivious to task scheduling information from the analytic...
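A sketch contrasting LRU with a scheduler-aware eviction policy (illustrative only, not the framework's actual implementation): when the task scheduler exposes which cached partitions upcoming stages will read, evict the block whose next use is furthest away instead of the least recently used one.

```python
def evict_lru(cache, last_used):
    """cache: set of block ids; last_used: block -> last access time."""
    return min(cache, key=lambda b: last_used[b])

def evict_schedule_aware(cache, upcoming_reads):
    """upcoming_reads: ordered list of block ids that future tasks will access.
    Evict a block never read again if possible, else the one read furthest away."""
    next_use = {}
    for pos, b in enumerate(upcoming_reads):
        next_use.setdefault(b, pos)
    return max(cache, key=lambda b: next_use.get(b, float("inf")))

if __name__ == "__main__":
    cache = {"p0", "p1", "p2"}
    last_used = {"p0": 10, "p1": 42, "p2": 35}
    upcoming = ["p1", "p0", "p1"]  # p2 is never needed again
    print("LRU evicts:           ", evict_lru(cache, last_used))            # p0
    print("schedule-aware evicts:", evict_schedule_aware(cache, upcoming))  # p2
```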
Modern cloud infrastructures live in an open world, characterized by continuous changes in the environment and in the requirements they have to meet. Continuous changes occur autonomously and unpredictably, and they are out of the control of the provider. Therefore, advanced solutions have to be developed that are able to dynamically adapt the infrastructure while providing service performance guarantees. A number of autonomic computing solutions have been proposed such that resources are allocated among running applications on the basis of short-term demand estimates. However, only energy...
Improving data locality for MapReduce jobs is critical to the performance of large-scale Hadoop clusters, embodying the principle of moving computation close to data in big data platforms. Scheduling tasks in the vicinity of the stored data can significantly diminish network traffic, which is crucial for system stability and efficiency. Though the locality issue has been investigated extensively for MapTasks, most existing schedulers ignore it for ReduceTasks when fetching intermediate data, causing performance degradation. This problem of reducing the fetching cost has been identified...
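A sketch of locality-aware ReduceTask placement (an illustrative cost model, not the paper's exact scheduler): given how much intermediate data for one partition sits on each node, place the reducer on the node that minimizes the remote bytes it must fetch, weighting cross-rack transfers more heavily than intra-rack ones.

```python
def fetch_cost(reducer_node, bytes_on_node, rack_of, cross_rack_weight=3.0):
    """Weighted remote bytes a reducer on reducer_node would have to fetch."""
    cost = 0.0
    for node, size in bytes_on_node.items():
        if node == reducer_node:
            continue  # local data is free
        same_rack = rack_of[node] == rack_of[reducer_node]
        cost += size * (1.0 if same_rack else cross_rack_weight)
    return cost

def best_reducer_node(candidate_nodes, bytes_on_node, rack_of):
    return min(candidate_nodes, key=lambda n: fetch_cost(n, bytes_on_node, rack_of))

if __name__ == "__main__":
    rack_of = {"n1": "r1", "n2": "r1", "n3": "r2"}
    bytes_on_node = {"n1": 800, "n2": 100, "n3": 300}  # MB of map output
    print(best_reducer_node(["n1", "n2", "n3"], bytes_on_node, rack_of))  # n1
```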
In this paper, we describe our experiences and lessons learned from building a general-purpose in-memory key-value middleware called HydraDB. HydraDB synthesizes a collection of state-of-the-art techniques, including continuous fault tolerance, Remote Direct Memory Access (RDMA), and awareness of multicore systems, to deliver a high-throughput, low-latency access service in a reliable manner for cluster computing applications.
Accurate recognition of system operating conditions is the basis for ensuring the safe and stable operation of oil and gas pipeline network systems. However, existing condition recognition mainly relies on manual experience, which cannot dynamically track changes in operating status, and the data stored in SCADA systems lacks labels, which makes it difficult to explore the potential value of the data. In this work, a hybrid neural model based on multiplex visibility graphs (MVG) is proposed for condition classification. Firstly, generative adversarial networks (GANs)...
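A sketch of building a natural visibility graph from a single time series, the basic construct behind multiplex visibility graphs (one layer per sensor channel). This shows only the standard visibility criterion, not the paper's full MVG + GAN pipeline; the data is invented.

```python
def visibility_graph(series):
    """Nodes are time indices; i and j are connected if the straight line
    between (i, y_i) and (j, y_j) stays above every intermediate sample."""
    n = len(series)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            visible = True
            for k in range(i + 1, j):
                # Height of the line i--j evaluated at time k.
                line_k = series[j] + (series[i] - series[j]) * (j - k) / (j - i)
                if series[k] >= line_k:
                    visible = False
                    break
            if visible:
                edges.add((i, j))
    return edges

if __name__ == "__main__":
    pressure = [3.0, 1.0, 4.0, 2.0, 5.0]  # toy SCADA-like channel
    print(sorted(visibility_graph(pressure)))
```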
For MapReduce/Hadoop, the map and reduce phases exhibit fundamentally different characteristics. Additionally, these two phases have a complicated and tight dependency on each other, causing the repeatedly observed starvation problem with the widely used Fair Scheduler. To mitigate this problem, we design a Coupling Scheduler, which, among other new features, jointly schedules map and reduce tasks by coupling their progresses, different from existing schedulers that treat them separately. This is based on the intuition of allocating...
Recently, many in-memory key-value stores have started using a high-performance network protocol, Remote Direct Memory Access (RDMA), to provision ultra-low-latency access services. Among various solutions, previous studies recognized that leveraging RDMA Read to optimize GET operations while continuing to use message passing for other requests can offer tremendous performance improvement while avoiding read-write races. However, although such a design can utilize the power of RDMA when there is sufficient memory...
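A simplified in-process model of the dispatch design described above (a logical sketch, not real RDMA code): GET requests whose value location the client already knows are served by a one-sided read of the server's registered memory, while PUTs and cache-miss GETs fall back to two-sided message passing handled by the server. The Client/Server classes and their methods are invented for illustration.

```python
class Server:
    def __init__(self):
        self.memory = {}  # key -> value, stands in for a registered memory region

    def handle_message(self, op, key, value=None):
        # Two-sided path: the server CPU processes the request.
        if op == "PUT":
            self.memory[key] = value
            return "OK"
        if op == "GET":
            return self.memory.get(key)
        raise ValueError(op)

class Client:
    def __init__(self, server):
        self.server = server
        self.location_cache = set()  # keys the client can read one-sidedly

    def get(self, key):
        if key in self.location_cache:
            # One-sided path: bypasses the server CPU entirely.
            return self.server.memory[key]
        value = self.server.handle_message("GET", key)
        if value is not None:
            self.location_cache.add(key)
        return value

    def put(self, key, value):
        return self.server.handle_message("PUT", key, value)

if __name__ == "__main__":
    c = Client(Server())
    c.put("k1", "v1")
    print(c.get("k1"))  # miss in the location cache -> message path
    print(c.get("k1"))  # hit -> one-sided read path
```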