- Caching and Content Delivery
- Advanced Data Storage Technologies
- Peer-to-Peer Network Technologies
- Blockchain Technology Applications and Security
- Cloud Computing and Resource Management
- Distributed Systems and Fault Tolerance
- Advanced Graph Neural Networks
- Privacy-Preserving Technologies in Data
- Cryptography and Data Security
- Cloud Data Security Solutions
- Data Management and Algorithms
- IoT and Edge/Fog Computing
- Advanced Database Systems and Queries
- Complex Network Analysis Techniques
- Spam and Phishing Detection
- Data Quality and Management
- Opinion Dynamics and Social Influence
- Human Mobility and Location-Based Analysis
- Authorship Attribution and Profiling
- Stochastic Gradient Optimization Techniques
- Opportunistic and Delay-Tolerant Networks
- Cellular Automata and Applications
- Web Data Mining and Analysis
- Retinal Imaging and Analysis
- Parallel Computing and Optimization Techniques
Antea Group (France)
2021-2024
Zhejiang Environmental Monitoring Center
2024
Shandong University
2023
Zhejiang Science and Technology Information Institute
2023
Alibaba Group (China)
2021
Agency for Science, Technology and Research
2014-2019
Institute of High Performance Computing
2018-2019
Data Storage Institute
2013-2018
A*STAR Graduate Academy
2017
National University of Singapore
2010-2011
Blockchain performance cannot meet current requirements. One crucial way to improve it is sharding. However, most blockchain sharding research focuses on public blockchains. As for consortium blockchains, previous studies cannot support high cross-shard efficiency, cross-contract flexibility, shard availability, and strict transaction atomicity, which are essential requirements but also challenges in consortium blockchain systems. Facing these challenges, we propose Meepo, a systematic study on sharded consortium blockchain. Meepo enhances...
We have designed and developed OceanBase, a distributed relational database system, from the very basics for a decade. Being a scale-out multi-tenant system, OceanBase is cross-region fault tolerant and is based on a shared-nothing architecture. Besides sharing many similar goals with alternative distributed DBMSs, such as horizontal scalability and fault tolerance, our design has been driven by the demands of typical RDBMS compatibility, as well as both on-premise and off-premise deployments. OceanBase has fulfilled its goal. It...
Cloud computing represents a paradigm shift driven by the increasing demand of Web-based applications for elastic, scalable, and efficient system architectures that can support their ever-growing data volume and large-scale data analysis. A typical data management system has to deal with real-time updates by individual users, as well as periodic large-scale analytical processing, indexing, and data extraction. While such operations may take place in the same domain, the design and development of the systems have somehow evolved...
The evolution of blockchain technology has greatly changed the network, making it possible for many applications to be distributed and decentralized without loss of security. Ethereum is an open-source blockchain platform that provides a runtime environment for running smart contracts, called the Ethereum Virtual Machine (EVM). Ethereum-based applications are usually referred to as Decentralized Applications (DApps), since they are based on the EVM and its smart contracts. Meanwhile, distributed data storage is also evolving fast along with blockchain technology. Distributed storage...
Blockchain has attracted a lot of attention in recent years. However, the performance of blockchain cannot meet the requirements of massive numbers of Internet of Things (IoT) devices. One of the important bottlenecks is the limited computing resources on a single server while executing transactions. To address this issue, we propose Aeolus to achieve distributed execution of transactions. There are two key challenges in achieving distributed execution for IoT blockchain: transaction structure and state consistency. Facing these challenges, we first propose a transaction structure, which...
Blockchain performance cannot meet current requirements. One crucial way to improve it is sharding. However, most blockchain sharding research focuses on public blockchains. As for consortium blockchains, previous studies cannot support high cross-shard efficiency, multiple-shard contract calling, strict transaction atomicity, and shard availability, which are essential requirements but also challenges in consortium blockchain systems. Facing these challenges, we propose Meepo, a systematic study on sharded consortium blockchain. Meepo...
Skyline queries are capable of retrieving interesting points from a large data set according to multiple criteria. Most work on skyline queries so far has assumed centralized storage, whereas in practice the relevant data are often distributed among geographically scattered sites. In this work, we tackle constrained skyline queries in large-scale distributed environments without assuming any overlay structures, and propose a novel algorithm named PaDSkyline (Parallel Distributed Skyline query processing). PaDSkyline significantly shortens the response time by performing...
User identification is very helpful for building a better profile of a user. Some works have been devoted to this issue. However, the existing works with good performance are mainly based on rich online data and do not consider the cost of data acquisition. In this paper, we aim to address this issue at a lower cost. A machine learning-based solution is proposed that relies solely on users' display names. It consists of three key steps: first, we analyze users' unique naming patterns that lead to information redundancies across sites; second, we construct features...
With the development of blockchain, more and more blockchain types have emerged: public, consortium, and private blockchains. Because nodes can be trusted in some blockchains, a non-Byzantine fault-tolerant consensus algorithm, KRaft (Kademlia-Raft), with high throughput and scalability is proposed. KRaft is a Raft-like consensus algorithm that preserves the logic part of the Raft algorithm. It optimizes the leader election process through the K-bucket relationships established in the Kademlia protocol, improving speed and throughput. Firstly, KRaft uses the K-bucket established by the Kademlia protocol to achieve stable...
In the ongoing evolution of the OceanBase database system, it is essential to enhance its adaptability to small-scale enterprises. The system has demonstrated stability and effectiveness within Ant Group and other commercial organizations, as well as through the TPC-C and TPC-H tests. In this paper, we have designed a stand-alone and distributed integrated architecture named Paetica to address the overhead caused by distributed components in stand-alone mode, with respect to the OceanBase system. Paetica enables adaptive configuration that allows the system to support both serial and parallel...
Vertical federated learning (VFL) is an emerging paradigm for cross-silo organizations to build more accurate machine learning (ML) models. In this setting, multiple organizations (i.e., parties) hold the same set of samples with different features. However, different parties may have redundant or highly correlated features, leading to inefficient and ineffective VFL model training. Effective feature selection in VFL is therefore essential to mitigate such a problem and improve effectiveness, as well as computation and communication efficiency. To...
Vertical federated learning (VFL) trains a model when the features of data samples are scattered over multiple clients. To improve efficiency, a promising approach is to find a coreset and use it as a smaller training set. However, existing methods produce a large coreset when there are many clients and have a long running time. To address these problems, we propose HaCore for efficient coreset construction in the VFL setting. HaCore first employs locality sensitive hashing (LSH) to map data samples to bit signatures locally on the clients, and then merges the local...
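To illustrate the idea of mapping samples to bit signatures with LSH, here is a minimal Python sketch. It assumes random-hyperplane (sign-of-projection) hashing, which is one common LSH family; the abstract does not say which family HaCore uses, and the names `make_hyperplanes`, `signature`, and `hamming` are hypothetical helpers introduced only for this example.

```python
import random

def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (Gaussian normals) in dim dimensions.

    Assumption: random-hyperplane LSH, chosen for illustration only.
    """
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def signature(vec, planes):
    """One bit per hyperplane: 1 if vec lies on its non-negative side, else 0."""
    return tuple(
        1 if sum(w * x for w, x in zip(plane, vec)) >= 0 else 0
        for plane in planes
    )

def hamming(a, b):
    """Number of positions where two bit signatures differ."""
    return sum(x != y for x, y in zip(a, b))
```

Vectors pointing in similar directions tend to agree on most bits, so a small Hamming distance between signatures is a cheap proxy for similarity. In a vertical setting, each client could compute such signatures over its own feature columns, though how local signatures are merged is specific to the method and is not sketched here.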
In recent years, Graph Neural Networks (GNNs) have achieved remarkable success in many graph mining tasks. However, scaling them to large graphs is challenging due to the high computational and storage costs of repeated feature propagation and non-linear transformation during training. One commonly employed approach to address this challenge is model simplification, which executes the Propagation (P) only once in pre-processing, Combines (C) these receptive fields in different ways, and then feeds them into a simple model for...
Innovations such as the Cloud, Internet of Things (IoT), and data analytics have already dramatically altered the customer experience in many, if not all, industries. Blockchain, another emerging technology, is expected to be the next-generation infrastructure for establishing trusted multiparty collaborations. In this paper, we investigated the convergence of the aforementioned technologies by presenting a prototype of fine-grained transportation insurance. Insurance premiums were assessed based on vehicle usage...
Accurate segmentation of retinal layer boundaries can facilitate the detection of patients with early ophthalmic disease. Typical segmentation algorithms operate at low resolutions without fully exploiting multi-granularity visual features. Moreover, several related studies do not release their datasets, which are key for research on deep learning-based solutions. We propose a novel end-to-end network based on ConvNeXt, which can retain more feature map details by using a new depth-efficient attention module and...
Efficient and scalable distributed metadata management is critically important to overall system performance in large-scale distributed file systems, especially in the EB-scale era. Hash-based mapping and subtree partitioning are state-of-the-art distributed metadata management schemes. Hash-based mapping evenly distributes workload among metadata servers, but it eliminates all hierarchical locality of metadata. Subtree partitioning does not uniformly distribute workload among metadata servers, and metadata needs to be migrated to keep the load roughly balanced. Distributed metadata management is relatively difficult since it has to guarantee metadata consistency....
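The trade-off between the two schemes can be made concrete with a small Python sketch. This is a toy illustration under stated assumptions, not the scheme proposed in the paper: `hash_server` and `subtree_server` are hypothetical helpers, SHA-256 is an arbitrary choice of hash, and the subtree table is a hand-written example.

```python
import hashlib

def hash_server(path, n_servers):
    """Hash-based mapping: the server depends only on the hash of the full path.

    Load spreads evenly, but files in the same directory usually land on
    different servers, so hierarchical locality is lost.
    """
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_servers

def subtree_server(path, subtree_map, default=0):
    """Subtree partitioning: the longest matching directory prefix picks the server.

    Everything under one subtree stays together (good locality), but a hot
    subtree can overload its server unless metadata is migrated.
    """
    best_prefix, server = "", default
    for prefix, owner in subtree_map.items():
        base = prefix.rstrip("/")
        inside = path == prefix or path.startswith(base + "/")
        if inside and len(prefix) > len(best_prefix):
            best_prefix, server = prefix, owner
    return server
```

With a table such as `{"/home": 0, "/home/alice": 1, "/var": 2}`, every path under `/home/alice` resolves to server 1, whereas `hash_server` would scatter those same paths across all servers. Renaming a directory changes the hash input of every descendant path, which is why hash-based schemes can force large migrations on rename.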
Blockchain was first introduced by Bitcoin in 2009, and developers all around the world have been trying to apply blockchain to different areas, such as financial services, credit and ownership management, resource sharing, investment, and the Internet of Things (IoT). Ethereum is a blockchain 2.0 platform that allows developers to build Decentralized Applications (DApps) without building a new blockchain from scratch. IoT technology embeds physical devices with sensors and chips to provide process automation via machine-to-machine communication. Blynk...
An interesting problem in peer-based data management is efficient support for skyline queries within a multiattribute space. A skyline query retrieves, from a set of multidimensional data points, the subset of interesting points, compared to which no other points are better. Skyline queries play an important role in multi-criteria decision making and user preference applications. In this paper, we address the skyline computing problem in a structured P2P network. We exploit the iMinMax(θ) transformation to map high-dimensional data points to 1-dimensional values. All transformed...
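The skyline definition above can be sketched in a few lines of Python. This is a naive centralized version for illustration only, assuming that smaller values are better in every dimension; it is not the P2P algorithm the abstract describes, and `dominates` and `skyline` are hypothetical names.

```python
def dominates(p, q):
    """True if p is at least as good as q in every dimension and strictly
    better in at least one (assumption: smaller is better)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def skyline(points):
    """Return the points not dominated by any other point (O(n^2) comparison)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Example: hotels as (price, distance_to_beach) pairs.
hotels = [(50, 8), (60, 3), (80, 1), (70, 5), (90, 9)]
print(skyline(hotels))  # [(50, 8), (60, 3), (80, 1)]
```

Here `(70, 5)` is dropped because `(60, 3)` is both cheaper and closer, and `(90, 9)` is dropped because `(50, 8)` beats it on both attributes. The three remaining hotels each represent a different price/distance trade-off.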
Efficient and scalable distributed metadata management is critically important to overall system performance in large-scale distributed storage systems, especially in the EB era. Traditional state-of-the-art distributed metadata management schemes include hash-based mapping and subtree partitioning. The former evenly distributes workload among metadata servers, but it eliminates all hierarchical locality of metadata. It cannot efficiently handle some operations, e.g., renaming or moving a directory, which requires metadata to be migrated among servers. The latter does not...