- Cloud Computing and Resource Management
- Advanced Data Storage Technologies
- Distributed Systems and Fault Tolerance
- Caching and Content Delivery
- Blockchain Technology Applications and Security
- IoT and Edge/Fog Computing
- Distributed and Parallel Computing Systems
- Software-Defined Networks and 5G
- Scientific Computing and Data Management
- Advanced Image and Video Retrieval Techniques
- Cryptography and Data Security
- Data Quality and Management
- Cloud Data Security Solutions
- Advanced Malware Detection Techniques
- Diverse Aspects of Tourism Research
- Web Data Mining and Analysis
- Security and Verification in Computing
- Data Stream Mining Techniques
- Network Security and Intrusion Detection
- Multimedia Communication and Technology
- Recommender Systems and Techniques
- Parallel Computing and Optimization Techniques
- Cruise Tourism Development and Management
- QR Code Applications and Technologies
- Visual Attention and Saliency Detection
Shandong Xiehe University
2019-2023
Hewlett Packard Enterprise (United States)
2023
Tokyo Metropolitan University
2023
Nanjing University
2021-2022
National University of Singapore
2015-2021
NARI Group (China)
2013-2020
Chinese Academy of Sciences
2020
Institute of Computing Technology
2020
Korea University
2018
Hewlett-Packard (United States)
1999-2017
Existing blockchain systems scale poorly because of their distributed consensus protocols. Current attempts at improving scalability are limited to cryptocurrency. Scaling under general workloads (i.e., non-cryptocurrency applications) remains an open question. This work takes a principled approach to apply sharding in order to improve transaction throughput at scale. This is challenging, however, due to the fundamental difference in failure models between databases and blockchain. To achieve our goal, we first...
The success of Bitcoin and other cryptocurrencies brings enormous interest to blockchains. A blockchain system implements a tamper-evident ledger for recording transactions that modify some global states. It captures the entire evolution history of the states. The management of that history, also known as data provenance or lineage, has been studied extensively in database systems. However, querying existing blockchains can only be done by replaying all transactions. This approach is applicable to large-scale, offline...
Existing data storage systems offer a wide range of functionalities to accommodate an equally diverse range of applications. However, new classes of applications have emerged, e.g., blockchain and collaborative analytics, featuring versioning, fork semantics, tamper-evidence or any combination thereof. They present opportunities for storage systems to efficiently support such applications by embedding the above requirements into the storage. In this paper, we present ForkBase, a storage engine designed for forkable applications. By integrating core application properties...
With 5G on the verge of being adopted as the next mobile network, there is a need to analyze its impact on the landscape of computing and data management. In this paper, we analyze its impact on both traditional and emerging technologies and project our view on future research challenges and opportunities. With a predicted increase of 10-100× in bandwidth and a 5-10× decrease in latency, 5G is expected to be the main enabler for smart cities, IoT and efficient healthcare, where machine learning is conducted at the edge. In this context, we investigate how 5G can help the development of federated...
Efficient and scalable stream joins play an important role in performing real-time analytics for many cloud applications. However, as in conventional database processing, online theta-joins over data streams are computationally expensive and, moreover, being memory-based, they impose a high memory requirement on the system. In this paper, we propose a novel stream join model, called join-biclique, which organizes a large cluster as a complete bipartite graph. Join-biclique has several strengths over state-of-the-art...
Blockchain has come a long way: a system that was initially proposed specifically for cryptocurrencies is now being adapted and adopted as a general-purpose transactional system. As blockchain evolves into another data management system, the natural question is how it compares against distributed database systems. Existing works on this comparison focus on high-level properties, such as security and throughput. They stop short of showing how the underlying design choices contribute to the overall differences. Our work...
Shared-nothing architecture has been widely used in distributed databases to achieve good scalability. While it offers superior performance for local transactions, the overhead of processing distributed transactions can degrade system performance significantly. The key contributor to the degradation is the expensive two-phase commit (2PC) protocol used to ensure atomic commitment of distributed transactions. In this paper, we propose a transaction management scheme called LEAP to avoid the 2PC protocol within distributed transaction processing. Instead of processing a distributed transaction across multiple nodes, LEAP converts...
The widely adopted single-threaded OLTP model assigns a single thread to each static partition of the database for processing transactions in that partition. This simplifies concurrency control while retaining parallelism. However, it suffers performance loss arising from skewed workloads as well as transactions that span multiple partitions. In this paper, we present a dynamic single-threaded in-memory OLTP system, called LADS, that extends the simplicity of the single-threaded model. The key innovation of LADS is the separation of dependency resolution and execution into two...
In this paper, we present PaxosStore, a high-availability storage system developed to support the comprehensive business of WeChat. It employs a combinational design in the storage layer to engage multiple storage engines constructed for different storage models. PaxosStore is characterized by extracting the Paxos-based distributed consensus protocol as a middleware that is universally accessible to the underlying multi-model storage engines. This facilitates tuning, maintaining, scaling and extending the storage engines. According to our experience in engineering practice,...
With the ever-increasing adoption of machine learning for data analytics, maintaining a machine learning pipeline is becoming more complex as both the datasets and the trained models evolve with time. In a collaborative environment, the changes and updates due to pipeline evolution often cause cumbersome coordination and maintenance work, raising the costs and making it hard to use. Existing solutions, unfortunately, do not address the version evolution problem, especially in a collaborative environment where non-linear version control semantics are necessary to isolate operations made by...
Existing data storage systems offer a wide range of functionalities to accommodate an equally diverse range of applications. However, new classes of applications have emerged, e.g., blockchain and collaborative analytics, featuring versioning, fork semantics, tamper-evidence or any combination thereof. They present opportunities for storage systems to efficiently support such applications by embedding the above requirements into the storage. In this paper, we present ForkBase, a storage engine specifically designed to provide efficient support for forkable applications. By integrating...
Effective overload control for a large-scale online service system is crucial for protecting the system backend from overload. Conventionally, the design of overload control is ad hoc for each individual service. However, service-specific overload control could be detrimental to the overall system due to intricate service dependencies or flawed implementation of a service. Service developers usually have difficulty accurately estimating the dynamics of the actual workload during the development of a service. Therefore, it is essential to decouple overload control from the service logic. In this paper, we propose DAGOR, an overload control scheme designed for the account-oriented...
System virtualization, which provides good isolation, is now widely used in server consolidation. Meanwhile, one of the hot topics in this field is to extend virtualization for embedded systems. However, current popular virtualization platforms do not support real-time operating systems such as embedded Linux well because the platform is not real-time aware, which brings low-performance I/O and high scheduling latency. The goal of this paper is to optimize the Xen virtualization platform to be real-time operating system friendly. We improve two aspects of the virtualization platform. First, we improve the Xen scheduler to manage scheduling latency...
The success of Bitcoin and other cryptocurrencies brings enormous interest to blockchains. A blockchain system implements a tamper-evident ledger for recording transactions that modify some global states. It captures the entire evolution history of the states. The management of that history, also known as data provenance or lineage, has been studied extensively in database systems. However, querying existing blockchains can only be done by replaying all transactions. This approach is feasible for large-scale, offline...
It is crucial to minimize virtualization overhead for virtual machine deployment. The conventional x86 CPU is incapable of classical trap-and-emulate virtualization, so paravirtualization was formerly the optimal virtualization strategy. Since architectural extensions were introduced to support classical virtualization, hardware-assisted virtualization has become a competitive alternative method. Hardware-assisted virtualization is superior in CPU and memory virtualization, yet paravirtualization is still valuable in some aspects as it is capable of shortening the disposal path of I/O virtualization. Thus we propose hybrid virtualization, which...
Today's storage systems expose abstractions which are either so low-level (e.g., key-value store, raw-block store) that they require developers to reinvent the wheel, or so high-level (e.g., relational databases, Git) that they lack the generality to support many classes of applications. In this work, we propose and implement a general distributed data storage system, called UStore, which has rich semantics. UStore delivers three key properties, namely immutability, sharing and security, which unify and add value to many classes of today's applications, and which also...
Data collaboration activities typically require systematic or protocol-based coordination to be scalable. Git, an effective enabler for collaborative coding, has been attested for its success in countless projects around the world. Hence, applying the Git philosophy to general data collaboration beyond coding is motivating. We call it Git for data. However, the original Git design handles data at the file granule, which is considered too coarse-grained for many database applications. We argue that Git for data should be co-designed with database systems. To this end, we...