- Cloud Computing and Resource Management
- IoT and Edge/Fog Computing
- Distributed and Parallel Computing Systems
- Advanced Malware Detection Techniques
- Advanced Data Storage Technologies
- Distributed Systems and Fault Tolerance
- Software System Performance and Reliability
- Software Testing and Debugging Techniques
- Graph Theory and Algorithms
- Network Security and Intrusion Detection
- Software-Defined Networks and 5G
- Caching and Content Delivery
- Parallel Computing and Optimization Techniques
- Digital and Cyber Forensics
- Advanced Queuing Theory Analysis
- Software Engineering Research
- Security and Verification in Computing
- Age of Information Optimization
- Advanced Database Systems and Queries
- Network Traffic and Congestion Control
- Software Reliability and Analysis Research
- Data Stream Mining Techniques
- Information and Cyber Security
- Cloud Data Security Solutions
- Big Data Technologies and Applications
China University of Petroleum, Beijing
2022-2024
Kelun Group (China)
2024
Communication University of Zhejiang
2023
Amazon (Germany)
2021
China University of Petroleum, East China
2019-2020
Tsinghua University
2020
Institute for Infocomm Research
2018
Agency for Science, Technology and Research
2018
IBM Research - Thomas J. Watson Research Center
2004-2017
IBM Research - Almaden
2017
The scalability of modern data centers has become a practical concern and has attracted significant attention in recent years. In contrast to existing solutions that require changes to the network architecture and routing protocols, this paper proposes using traffic-aware virtual machine (VM) placement to improve scalability. By optimizing the placement of VMs on host machines, traffic patterns among VMs can be better aligned with the communication distance between them, e.g., VMs with large mutual bandwidth usage are assigned to host machines...
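A minimal sketch of the traffic-aware placement idea, assuming a simple greedy heuristic (not necessarily the paper's exact algorithm): co-locate the VM pairs with the highest mutual traffic on hosts that are close in the topology. All names, the distance model, and the slot limit are illustrative assumptions.

```python
def place_vms(traffic, hosts, host_distance, slots_per_host=2):
    """traffic: dict[(vm_a, vm_b)] -> bytes/s; hosts: list of host ids;
    host_distance: dict[(h1, h2)] -> hops. Returns a vm -> host mapping."""
    placement = {}
    load = {h: 0 for h in hosts}

    # Consider the heaviest-talking VM pairs first.
    for (a, b), _ in sorted(traffic.items(), key=lambda kv: -kv[1]):
        for vm in (a, b):
            if vm in placement:
                continue
            # Prefer the host closest to the partner's host (if already placed),
            # breaking ties by the least-loaded host.
            partner = b if vm == a else a
            anchor = placement.get(partner)
            candidates = [h for h in hosts if load[h] < slots_per_host]
            if anchor is not None:
                candidates.sort(key=lambda h: (host_distance[(anchor, h)], load[h]))
            else:
                candidates.sort(key=lambda h: load[h])
            placement[vm] = candidates[0]
            load[candidates[0]] += 1
    return placement

if __name__ == "__main__":
    hosts = ["h1", "h2", "h3"]
    dist = {(x, y): (0 if x == y else 1) for x in hosts for y in hosts}
    traffic = {("vm1", "vm2"): 900, ("vm3", "vm4"): 500, ("vm1", "vm3"): 10}
    print(place_vms(traffic, hosts, dist))
```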
Resource provisioning in compute clouds often requires an estimate of the capacity needs of Virtual Machines (VMs). The estimated VM size is the basis for allocating resources commensurate with workload demand. In contrast to the traditional practice of estimating VM sizes individually, we propose a joint-VM sizing approach in which multiple VMs are consolidated and provisioned together, based on their aggregate capacity needs. This new approach exploits statistical multiplexing among the workload patterns of VMs, i.e., the peaks and valleys of one pattern do not...
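A toy illustration of why joint sizing can save capacity (an assumed formulation, not the paper's exact estimator): compare the sum of per-VM 95th-percentile demands against the 95th percentile of the aggregate demand. When peaks do not coincide, the joint estimate is smaller.

```python
import random

def percentile(samples, q):
    s = sorted(samples)
    idx = min(len(s) - 1, int(q * len(s)))
    return s[idx]

random.seed(0)
T = 1000
# Two synthetic VM demand traces whose peaks rarely overlap (illustrative data).
vm1 = [random.gauss(30, 5) + (40 if t % 100 < 10 else 0) for t in range(T)]
vm2 = [random.gauss(30, 5) + (40 if 50 <= t % 100 < 60 else 0) for t in range(T)]

individual = percentile(vm1, 0.95) + percentile(vm2, 0.95)
joint = percentile([a + b for a, b in zip(vm1, vm2)], 0.95)

print(f"sum of individual 95th percentiles: {individual:.1f}")
print(f"95th percentile of aggregate demand: {joint:.1f}")  # typically smaller
```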
Recent advances in virtualization technology have made it common practice to consolidate virtual machines (VMs) onto a smaller number of servers. An efficient consolidation scheme requires that VMs are packed tightly, yet receive resources commensurate with their demands. However, measurements from production data centers show that network bandwidth demands are dynamic, making it difficult to characterize them by a fixed value and apply traditional packing schemes. In this work, we formulate the VM consolidation problem as Stochastic Bin Packing...
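A sketch of one standard way to handle stochastic demands in bin packing (an effective-size heuristic under a Gaussian-demand assumption, not necessarily the algorithm in this paper): pack VMs so that each bin's mean-plus-z-sigma bandwidth stays within capacity.

```python
from math import sqrt

Z_95 = 1.645  # one-sided z-score for roughly 5% overflow probability

def effective_size(mu, sigma, z=Z_95):
    return mu + z * sigma

def first_fit_decreasing(vms, capacity, z=Z_95):
    """vms: list of (name, mean demand, std dev), assumed independent Gaussians.
    Keep each bin's mu_bin + z * sqrt(var_bin) within capacity."""
    bins = []  # each bin: {"vms": [...], "mu": float, "var": float}
    for name, mu, sigma in sorted(vms, key=lambda v: -effective_size(v[1], v[2], z)):
        for b in bins:
            if b["mu"] + mu + z * sqrt(b["var"] + sigma ** 2) <= capacity:
                b["vms"].append(name)
                b["mu"] += mu
                b["var"] += sigma ** 2
                break
        else:
            bins.append({"vms": [name], "mu": mu, "var": sigma ** 2})
    return [b["vms"] for b in bins]

if __name__ == "__main__":
    vms = [("vm1", 300, 60), ("vm2", 250, 40), ("vm3", 400, 100), ("vm4", 150, 30)]
    print(first_fit_decreasing(vms, capacity=1000))  # Mbps, illustrative numbers
```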
Spark has been increasingly adopted by industry in recent years for big data analysis, providing a fault-tolerant, scalable, and easy-to-use in-memory abstraction. Moreover, the community has been actively developing a rich ecosystem around Spark, making it even more attractive. However, there is not yet a Spark-specific benchmark in the existing literature to guide the development and cluster deployment of Spark to better fit the resource demands of user applications. In this paper, we present SparkBench, a Spark-specific benchmarking suite, which...
With the increase of energy consumption associated with IT infrastructures, energy management is becoming a priority in the design and operation of complex service-based systems. At the same time, service providers need to comply with Service Level Agreement (SLA) contracts, which determine revenues and penalties on the basis of the achieved performance level. This paper focuses on the resource allocation problem in multitier virtualized systems with the goal of maximizing SLA revenue while minimizing energy costs. The main novelty of our approach is to address—in...
The MapReduce/Hadoop framework has been widely used to process large-scale datasets on computing clusters. Scheduling map tasks with data locality in mind is crucial to the performance of MapReduce. Many works have been devoted to increasing data locality for better efficiency. However, to the best of our knowledge, the fundamental limits of MapReduce clusters with data locality, including the capacity region and theoretical bounds on delay performance, have not been well studied. In this paper, we address these problems from a stochastic network...
MapReduce job parameter tuning is a daunting and time-consuming task. The configuration space is huge; there are more than 70 parameters that impact job performance. It is also difficult for users to determine suitable values for the parameters without first having a good understanding of the application characteristics. Thus, it is a challenge to systematically explore the space and select a near-optimal configuration. Extant offline approaches are slow and inefficient, as they entail multiple test runs and significant human effort.
With the development of Service Oriented Architecture (SOA), organizations are able to compose complex applications from distributed services supported by third-party providers. Under this scenario, large data centers serve many customers sharing the available IT resources, which leads to efficient use of resources and a reduction in operating costs. Providers and their customers often negotiate utility-based Service Level Agreements (SLAs) that determine costs and penalties based on the achieved performance levels. Data centers employ an autonomic...
Worldwide interest in the delivery of computing and storage capacity as a service continues to grow at a rapid pace. The complexities of such cloud centers require advanced resource management solutions that are capable of dynamically adapting the platform while providing continuous performance guarantees. The goal of this paper is to devise allocation policies for virtualized environments that satisfy availability guarantees and minimize energy costs in very large cloud centers. We present a scalable distributed hierarchical...
MapReduce/Hadoop production clusters exhibit heavy-tailed characteristics in job processing times. These phenomena result from both the workload features and the adopted scheduling algorithms. Analytically understanding the delays under different schedulers for MapReduce can facilitate the design and deployment of large Hadoop clusters. The map and reduce tasks of a job have fundamental differences and a tight dependence between them, complicating the analysis. This also leads to an interesting starvation problem with the widely used Fair...
Schedulers are critical in enhancing the performance of MapReduce/Hadoop in the presence of multiple jobs with different characteristics and goals. Though current schedulers for Hadoop are quite successful, they still have room for improvement: map tasks (MapTasks) and reduce tasks (ReduceTasks) are not jointly optimized, albeit there is a strong dependence between them. This can cause job starvation and unfavorable data locality. In this paper, we design and implement a resource-aware scheduler for Hadoop. It couples the progresses...
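An illustrative sketch of the coupling idea (a simplification, not the full scheduler described above): grant ReduceTask slots to a job in proportion to its MapTask progress, so reducers neither start too early (idling slots while waiting for map output) nor too late (delaying the shuffle).

```python
def reduce_slots_to_grant(map_finished, map_total, reduce_total, reduce_running):
    """Target: the fraction of reducers running tracks the fraction of maps done."""
    map_progress = map_finished / max(map_total, 1)
    target = round(map_progress * reduce_total)
    return max(0, target - reduce_running)

# Example: a job with 100 maps and 10 reducers (illustrative numbers).
for finished in (0, 30, 60, 100):
    grant = reduce_slots_to_grant(finished, 100, 10, reduce_running=0)
    print(f"maps finished: {finished:3d} -> launch up to {grant} reducers")
```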
Scheduling map tasks to improve data locality is crucial to the performance of MapReduce. Many works have been devoted to increasing data locality for better efficiency. However, to the best of our knowledge, the fundamental limits of MapReduce computing clusters with data locality, including the capacity region and theoretical bounds on delay performance, have not been studied. In this paper, we address these problems from a stochastic network perspective. Our focus is to strike the right balance between data locality and load balancing to simultaneously...
Memory is a crucial resource for big data processing frameworks such as Spark and M3R, where memory is used both for computation and for caching intermediate storage data. Consequently, optimizing memory usage is key to extracting high performance. The extant approach is to statically split the memory based on workload profiling, which is unable to capture the varying characteristics and dynamic demands of workloads. Another factor that affects efficiency is the choice of placement and eviction policy. The LRU policy is oblivious to task scheduling information from the analytic...
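A sketch contrasting LRU with a scheduler-aware eviction policy (illustrative only, not the framework's actual implementation): when the task scheduler exposes which cached partitions upcoming stages will read, evict the block whose next use is furthest away instead of the least recently used one.

```python
def evict_lru(cache, last_used):
    """cache: set of block ids; last_used: block -> last access time."""
    return min(cache, key=lambda b: last_used[b])

def evict_schedule_aware(cache, upcoming_reads):
    """upcoming_reads: ordered list of block ids that future tasks will access.
    Evict a block never read again if possible, else the one read furthest away."""
    next_use = {}
    for pos, b in enumerate(upcoming_reads):
        next_use.setdefault(b, pos)
    return max(cache, key=lambda b: next_use.get(b, float("inf")))

if __name__ == "__main__":
    cache = {"p0", "p1", "p2"}
    last_used = {"p0": 10, "p1": 42, "p2": 35}
    upcoming = ["p1", "p0", "p1"]  # p2 is never needed again
    print("LRU evicts:           ", evict_lru(cache, last_used))            # p0
    print("schedule-aware evicts:", evict_schedule_aware(cache, upcoming))  # p2
```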
Modern cloud infrastructures live in an open world, characterized by continuous changes in the environment and in the requirements they have to meet. Continuous changes occur autonomously and unpredictably, and they are out of the control of the provider. Therefore, advanced solutions have to be developed that are able to dynamically adapt the infrastructure while providing service performance guarantees. A number of autonomic computing solutions have been proposed such that resources are allocated among running applications on the basis of short-term demand estimates. However, only energy...
Improving data locality for MapReduce jobs is critical to the performance of large-scale Hadoop clusters, embodying the principle of moving computation close to data in big data platforms. Scheduling tasks in the vicinity of the stored data can significantly diminish network traffic, which is crucial for system stability and efficiency. Though the locality issue has been investigated extensively for MapTasks, most existing schedulers ignore it for ReduceTasks when fetching intermediate data, causing performance degradation. This problem of reducing the fetching cost has been identified...
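A sketch of locality-aware ReduceTask placement (an illustrative cost model, not the paper's exact scheduler): given how much intermediate data for one partition sits on each node, place the reducer on the node that minimizes the remote bytes it must fetch, weighting cross-rack transfers more heavily than intra-rack ones.

```python
def fetch_cost(reducer_node, bytes_on_node, rack_of, cross_rack_weight=3.0):
    """Weighted remote bytes a reducer on reducer_node would have to fetch."""
    cost = 0.0
    for node, size in bytes_on_node.items():
        if node == reducer_node:
            continue  # local data is free
        same_rack = rack_of[node] == rack_of[reducer_node]
        cost += size * (1.0 if same_rack else cross_rack_weight)
    return cost

def best_reducer_node(candidate_nodes, bytes_on_node, rack_of):
    return min(candidate_nodes, key=lambda n: fetch_cost(n, bytes_on_node, rack_of))

if __name__ == "__main__":
    rack_of = {"n1": "r1", "n2": "r1", "n3": "r2"}
    bytes_on_node = {"n1": 800, "n2": 100, "n3": 300}  # MB of map output
    print(best_reducer_node(["n1", "n2", "n3"], bytes_on_node, rack_of))  # n1
```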
In this paper, we describe our experiences and lessons learned from building a general-purpose in-memory key-value middleware called HydraDB. HydraDB synthesizes a collection of state-of-the-art techniques, including continuous fault tolerance, Remote Direct Memory Access (RDMA), and awareness of multicore systems, to deliver a high-throughput, low-latency access service in a reliable manner for cluster computing applications.
Accurate recognition of system operating conditions is the basis for ensuring the safe and stable operation of oil and gas pipeline network systems. However, existing condition recognition mainly relies on manual experience, which cannot dynamically track changes in operating status, and the data stored in SCADA systems lacks labels, which makes it difficult to explore the potential value of the data. In this work, a hybrid neural model based on multiplex visibility graphs (MVG) is proposed for condition classification. Firstly, generative adversarial networks (GANs)...
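A sketch of building a natural visibility graph from a single time series, the basic construct behind multiplex visibility graphs (one layer per sensor channel). This shows only the standard visibility criterion, not the paper's full MVG + GAN pipeline; the data is invented.

```python
def visibility_graph(series):
    """Nodes are time indices; i and j are connected if the straight line
    between (i, y_i) and (j, y_j) stays above every intermediate sample."""
    n = len(series)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            visible = True
            for k in range(i + 1, j):
                # Height of the line i--j evaluated at time k.
                line_k = series[j] + (series[i] - series[j]) * (j - k) / (j - i)
                if series[k] >= line_k:
                    visible = False
                    break
            if visible:
                edges.add((i, j))
    return edges

if __name__ == "__main__":
    pressure = [3.0, 1.0, 4.0, 2.0, 5.0]  # toy SCADA-like channel
    print(sorted(visibility_graph(pressure)))
```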
For MapReduce/Hadoop, the map and reduce phases exhibit fundamentally different characteristics. Additionally, these two phases have a complicated and tight dependency on each other, causing the repeatedly observed starvation problem with the widely used Fair Scheduler. To mitigate this problem, we design a Coupling Scheduler, which, among other new features, jointly schedules map and reduce tasks by coupling their progresses, different from existing schedulers that treat them separately. This is based on the intuition of allocating...
Recently, many in-memory key-value stores have started using a high-performance network protocol, Remote Direct Memory Access (RDMA), to provision ultra-low-latency access services. Among various solutions, previous studies recognized that leveraging RDMA Read to optimize GET operations while continuing to use message passing for other requests can offer tremendous performance improvement while avoiding read-write races. However, although such a design can utilize the power of RDMA when there is sufficient memory...
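A simplified in-process model of the dispatch design described above (a logical sketch, not real RDMA code): GET requests whose value location the client already knows are served by a one-sided read of the server's registered memory, while PUTs and cache-miss GETs fall back to two-sided message passing handled by the server. The Client/Server classes and their methods are invented for illustration.

```python
class Server:
    def __init__(self):
        self.memory = {}  # key -> value, stands in for a registered memory region

    def handle_message(self, op, key, value=None):
        # Two-sided path: the server CPU processes the request.
        if op == "PUT":
            self.memory[key] = value
            return "OK"
        if op == "GET":
            return self.memory.get(key)
        raise ValueError(op)

class Client:
    def __init__(self, server):
        self.server = server
        self.location_cache = set()  # keys the client can read one-sidedly

    def get(self, key):
        if key in self.location_cache:
            # One-sided path: bypasses the server CPU entirely.
            return self.server.memory[key]
        value = self.server.handle_message("GET", key)
        if value is not None:
            self.location_cache.add(key)
        return value

    def put(self, key, value):
        return self.server.handle_message("PUT", key, value)

if __name__ == "__main__":
    c = Client(Server())
    c.put("k1", "v1")
    print(c.get("k1"))  # miss in the location cache -> message path
    print(c.get("k1"))  # hit -> one-sided read path
```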