- Advanced Data Storage Technologies
- Caching and Content Delivery
- Distributed and Parallel Computing Systems
- Parallel Computing and Optimization Techniques
- Cloud Computing and Resource Management
- Distributed Systems and Fault Tolerance
- Anomaly Detection Techniques and Applications
- Cellular Automata and Applications
- Video Surveillance and Tracking Methods
- Peer-to-Peer Network Technologies
- Adversarial Robustness in Machine Learning
- Data Stream Mining Techniques
- Magnetic Properties of Thin Films
- Cloud Data Security Solutions
- Network Security and Intrusion Detection
- Image Enhancement Techniques
- Advanced Neural Network Applications
- Algorithms and Data Compression
- Human Pose and Action Recognition
- Image and Video Quality Assessment
- Sensor Technology and Measurement Systems
- Advanced Computational Techniques and Applications
- Neural Networks and Applications
- Machine Learning and Data Classification
- Human Motion and Animation
Huazhong University of Science and Technology
2016-2025
Wuhan National Laboratory for Optoelectronics
2016-2025
Ministry of Education of the People's Republic of China
2012
Data Storage Institute
2008-2009
Memory disaggregation is a promising architecture for modern datacenters that separates compute and memory resources into independent pools connected by ultra-fast networks, which can improve utilization, reduce cost, and enable elastic scaling of resources. However, existing solutions based on remote direct memory access (RDMA) suffer from high latency and additional overheads, including page faults and code refactoring. Emerging cache-coherent interconnects such as CXL offer opportunities to reconstruct...
Enabling object detectors to recognize out-of-distribution (OOD) objects is vital for building reliable systems. A primary obstacle stems from the fact that models frequently do not receive supervisory signals from unfamiliar data, leading to overly confident predictions regarding OOD objects. While previous progress estimates uncertainty based on the detection model and in-distribution (ID) samples, we explore using pre-trained vision-language representations for object-level OOD detection. We first discuss...
Cloud computing is an elastic model in which users can lease resources from rentable infrastructure. It is gaining popularity due to its lower cost, high reliability, and huge availability. To utilize the powerful capability of cloud computing, this paper imports it into the data mining and machine learning field. As one of the most influential open competitions in this area, the Netflix Prize, with its massive dataset, drove thousands of teams across the world to attack the problem, among which the final winner was BellKor's Pragmatic...
NAND flash memory is widely used in various computing systems. However, flash blocks can sustain only a limited number of program/erase (P/E) cycles, which is referred to as their endurance. On one hand, in order to ensure data integrity, manufacturers often define the maximum P/E cycles according to the worst block endurance among all blocks. On the other hand, blocks exhibit large endurance variations, which introduce two serious problems. The first problem is that the error correcting code (ECC) is over-provisioned: it has to be designed to tolerate the worst case, which causes longer decoding...
This paper proposes Shoggoth, an efficient edge-cloud collaborative architecture for boosting inference performance on real-time video of changing scenes. Shoggoth uses online knowledge distillation to improve the accuracy of models suffering from data drift and offloads the labeling process to the cloud, alleviating the constrained resources of edge devices. At the edge, we design adaptive training using small batches to adapt models under limited computing power, and frame sampling for robustness while reducing bandwidth. The...
Disk data density improvement will eventually be limited by the super-paramagnetic effect for perpendicular recording. While various approaches to this problem have been proposed, Shingled Magnetic Recording (SMR) holds great promise to mitigate the scaling limit cost-effectively by overlapping tracks. However, the inherent properties of SMR limit Shingled Write Disk (SWD) applicability, since writing one track destroys the data previously stored on overlapped tracks. As a result, layout management designs have been proposed. In this paper, we present a hybrid wave-like...
This paper proposes a new SSD cache architecture, DEFT-Cache (Delayed Erasing and Fast Taping), that maximizes the I/O performance and reliability of RAID storage. First of all, DEFT-Cache exploits the inherent physical properties of flash memory by making use of old data that have been overwritten but are still in existence to minimize the small-write penalty of RAID5/6. As pages are overwritten in the SSD, the old pages are invalidated and become candidates for erasure by garbage collection. Our idea is to selectively delay erasure and let these otherwise useless pages contribute...
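The small-write penalty that DEFT-Cache targets comes from RAID-5's read-modify-write parity update, which needs the old data and old parity. A minimal Python sketch of the standard parity arithmetic (the function name is illustrative, not from the paper):

```python
def raid5_small_write_parity(old_parity: bytes, old_data: bytes,
                             new_data: bytes) -> bytes:
    """Read-modify-write parity update for a RAID-5 small write:
    P' = P xor D_old xor D_new, applied byte-wise."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))

# Three data chunks and their parity.
d1, d2, d3 = b"\x01\x02", b"\x0f\x0f", b"\x10\x20"
parity = bytes(a ^ b ^ c for a, b, c in zip(d1, d2, d3))

# Overwrite d2; the incremental update must match a full recompute.
d2_new = b"\xff\x00"
parity_inc = raid5_small_write_parity(parity, d2, d2_new)
parity_full = bytes(a ^ b ^ c for a, b, c in zip(d1, d2_new, d3))
```

If the overwritten old page is still available in the SSD cache, the `old_data` read never has to touch the disk, which is the property the architecture exploits.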
To expand the capacity of a RAID-5 array with additional disks, data have to be migrated between disks to leverage the extra space and performance gain. Conventional methods for expanding RAID-5 are very slow because they migrate almost all existing data and recalculate parity blocks. This paper proposes a new online expansion method for RAID-5, named parity-based migration (PBM). PBM only migrates blocks that form a special parallelogram with one side consisting of parity blocks. When adding m disks to n existing disks, PBM achieves the minimal migration ratio, which needs to move only m/(n+m)...
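The m/(n+m) ratio is the information-theoretic minimum for rebalancing: only the surplus that each of the n old disks holds above the new balanced load must move onto the m added disks. A small Python sketch of that counting argument (not the PBM layout itself):

```python
def minimal_moves(num_blocks: int, n: int, m: int) -> float:
    """Minimum block migrations needed to rebalance num_blocks blocks
    from n disks onto n + m disks with a uniform load."""
    per_disk_old = num_blocks / n          # load before expansion
    per_disk_new = num_blocks / (n + m)    # balanced load after expansion
    # Only the surplus on each old disk has to move to the new disks.
    return n * (per_disk_old - per_disk_new)

# With 4 original disks and 2 added, exactly m/(n+m) = 1/3 of data moves.
total = 1200
moved = minimal_moves(total, 4, 2)
```

Conventional round-robin re-striping, by contrast, relocates nearly every block, which is why it is so slow.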
NAND flash-based storage devices have gained a lot of popularity in recent years. Unfortunately, flash blocks suffer from limited endurance. To guarantee reliability, manufacturers prescribe a specified number of program/erase (P/E) cycles to define the endurance of blocks within the same chip. To extend the service lifetime of a device, existing works take P/E-based wear-leveling algorithms, which evenly distribute P/E cycles across blocks in the controller. However, many studies indicate that blocks exhibit wide...
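A P/E-based wear-leveler of the kind the abstract describes can be sketched as a greedy allocator that always reuses the block with the fewest erases, a toy model assuming uniform block endurance (the very assumption the paper questions):

```python
import heapq

class WearLeveler:
    """Toy P/E-cycle-based wear-leveler: always allocate the block with
    the fewest erase counts so that wear spreads evenly across blocks."""

    def __init__(self, num_blocks: int):
        # Min-heap of (erase_count, block_id) pairs.
        self.heap = [(0, b) for b in range(num_blocks)]
        heapq.heapify(self.heap)

    def allocate_and_erase(self) -> int:
        count, block = heapq.heappop(self.heap)
        heapq.heappush(self.heap, (count + 1, block))
        return block

wl = WearLeveler(4)
for _ in range(8):
    wl.allocate_and_erase()
counts = sorted(c for c, _ in wl.heap)
# After 8 erases over 4 blocks, every block has exactly 2 erases.
```

Under real endurance variation, spreading erases perfectly evenly retires the whole device when the weakest block wears out, which motivates variation-aware schemes.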
As disk volume grows rapidly, with terabyte capacities becoming the norm, RAID reconstruction time in case of a failure becomes prohibitively long. This paper presents a new architecture, S <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> -RAID, allowing the array to reconstruct very quickly upon a disk failure. The idea is to form skewed sub-RAIDs (S <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> -RAID) in the structure so that reconstruction can be done in parallel, dramatically speeding up data recovery and hence minimizing the chance of data loss. To make...
Areal density scaling in magnetic hard drives is in jeopardy as magnetic particles become unstable when they are sufficiently small. Shingled recording holds great promise to mitigate the problem cost-effectively by overlapping data tracks. However, this innovative technology suffers severely from slow small writes, which prevents shingled recording from being widely adopted in practice. This paper presents a new hybrid storage architecture that combines a shingled-recording disk and a fast SSD cache to achieve high capacity...
As disk volume grows rapidly, with terabyte capacities becoming the norm, the RAID reconstruction process in case of a failure takes a prohibitively long time. This paper presents a new architecture, S <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> -RAID, allowing the array to reconstruct very quickly upon a disk failure. The idea is to form skewed sub-arrays in the structure so that reconstruction can be done in parallel, dramatically speeding up data recovery and hence minimizing the chance of data loss. We analyse...
3-D NAND flash memory is gradually being widely used in solid state drives (SSDs), leading to increasing storage capacity. However, the read performance of the SSD is sacrificed for the decoding operations that are executed to guarantee data reliability. No matter whether data have bit errors, they will be sent to the error correcting code (ECC) engine for decoding, introducing a high read delay in the SSD. Error prechecking can help avoid redundant decoding of error-free data, but it induces extra checking overhead for all data. Motivated by this, we...
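The precheck idea can be illustrated with a cheap checksum gating an expensive decode. A minimal Python sketch, assuming a CRC32 precheck and a hypothetical `ecc_decode` stand-in for a real LDPC/BCH decoder (neither is from the paper):

```python
import zlib

def ecc_decode(data: bytes) -> bytes:
    # Placeholder for a costly LDPC/BCH decode; hypothetical here.
    raise NotImplementedError("full ECC decode needed")

def write_page(data: bytes):
    """Store a page together with a CRC so reads can precheck for errors."""
    return data, zlib.crc32(data)

def read_page(stored) -> bytes:
    data, crc = stored
    if zlib.crc32(data) == crc:
        return data            # error-free fast path: skip the ECC decode
    return ecc_decode(data)    # slow path, taken only when errors exist
```

The trade-off the abstract names is visible here: every read pays the (small) CRC check, while only erroneous reads pay the (large) decode.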
Current power management schemes in RAID arrays are mostly designed to conserve energy by spinning down a subset of disks in a standard architecture. However, doing so not only decreases disk parallelism, but also creates new problems: for example, chunks of a stripe cannot be accessed directly, or multiple chunks of the same stripe are stored on one disk, which affects spatial locality. We refer to these problems as degradation, which results in further performance degradation. To avoid such degradation, this paper proposes a storage architecture called...
The Learned Index, which utilizes effective machine learning models to accelerate locating positions in sorted data, has gained increasing attention in many big data scenarios. Using efficient learned models, these indexes build large nodes and flat structures, thereby greatly improving performance. However, most of the state-of-the-art designs are targeted at DRAM, and there is hence an urgent need to enable high-performance learned indexes on emerging Non-Volatile Memory (NVM). In this article, we first evaluate and analyze the performance...
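The core learned-index mechanism is a model that predicts a key's position, corrected by a bounded local search. A minimal Python sketch using one least-squares linear model over the whole sorted array (real designs use model hierarchies; this is an illustration, not the article's NVM design):

```python
import bisect

class LearnedIndex:
    """Minimal learned index: fit position ~ a*key + b on sorted keys,
    then correct the prediction with a bounded local search."""

    def __init__(self, keys):
        self.keys = keys
        n = len(keys)
        mean_x = sum(keys) / n
        mean_y = (n - 1) / 2
        cov = sum((x - mean_x) * (y - mean_y) for y, x in enumerate(keys))
        var = sum((x - mean_x) ** 2 for x in keys) or 1.0
        self.a = cov / var
        self.b = mean_y - self.a * mean_x
        # Maximum prediction error bounds the local search window.
        self.err = max(abs(self._predict(k) - i) for i, k in enumerate(keys))

    def _predict(self, key):
        return min(max(int(self.a * key + self.b), 0), len(self.keys) - 1)

    def lookup(self, key):
        p = self._predict(key)
        lo = max(0, p - self.err)
        hi = min(len(self.keys), p + self.err + 1)
        i = bisect.bisect_left(self.keys, key, lo, hi)
        return i if i < len(self.keys) and self.keys[i] == key else None

idx = LearnedIndex([2, 4, 8, 16, 32, 64])
# idx.lookup(16) returns position 3; absent keys return None.
```

Because the search window is only `2*err + 1` positions wide, lookups stay fast even when the model is imperfect, which is what lets learned indexes replace deep B+-tree traversals with one prediction plus a short probe.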
It is well known that with the explosive growth of data, the age of big data has arrived. How to save huge amounts of data is of great importance to both industry and academia. This paper puts forward a solution based on coding technologies to store a large amount of cold data in a storage system. By studying existing systems, we can not only maintain the system's reliability, but also improve the security and space utilization of storage systems. Due to the remarkable reliability and space-saving rate of coding technologies, importing such a schema into storage systems becomes a prerequisite. In our...
In modern replication storage systems, where data carries two or more copies, a primary group of disks is always up to service incoming requests while the other disks are often spun down to sleep states to save energy during slack periods. However, since new writes cannot be immediately synchronized onto all disks, system reliability is degraded. This paper develops PERAID, a high-performance, energy-efficient replication storage system, which aims to improve both performance and energy efficiency without compromising reliability. It...
As more public cloud computing platforms emerge in the market, a great challenge for these Infrastructure as a Service (IaaS) providers is how to measure the cost of, and charge Software as a Service (SaaS) clients for, their services. This problem is compounded by the virtualization technology deployed by many providers to consolidate servers and improve their utilization. This paper studies three different but related models for apportioning costs in a private or public environment supported by virtualized data centers. With given workload placement scenarios...
As datacenters grow in scale, increasing energy costs and carbon emissions have led data centers to seek renewable energy, such as wind and solar energy. However, tackling the challenges associated with the intermittency and variability of renewable energy is difficult. This paper proposes a scheme called GreenMatch, which deploys an SSD cache to match green power supplies by time-shifting the workload schedule while maintaining low latency for online data-intensive services. With the SSD cache, latency-sensitive requests access the disk...
To protect data in cloud storage, fault tolerance and efficient recovery become very important. Recent studies have developed numerous solutions based on erasure code techniques to solve this problem using functional repairs. However, there are two limitations to address. The first one is consistency, since the Encoding Matrix (EM) differs among clouds. The other is repair bandwidth, which is a concern for most users. We address these problems from both theoretical and practical perspectives. BMCloud,...
To satisfy the explosive growth of data in large-scale data centers, where redundant arrays of independent disks (RAIDs), especially RAID-5, are widely deployed, effective storage scaling and disk expansion methods are desired. However, how to reduce migration overhead and maintain the reliability of the original RAID are the major concerns in scaling. To address these problems, we propose a new scheme, H-Scale, to achieve fast scaling via hybrid stripe layouts. H-Scale takes advantage of the loose restriction on stripe structures to choose migrated blocks and create...
Out-of-distribution (OOD) detection aims at enhancing standard deep neural networks to distinguish anomalous inputs from the original training data. Previous progress has introduced various approaches in which the in-distribution (ID) data and even several OOD examples are prerequisites. However, due to privacy and security concerns, such auxiliary data tends to be impractical in real-world scenarios. In this paper, we propose a data-free method that does not rely on natural data, called Class-Conditional Impressions Reappearing (C2IR), which...
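For context on what an OOD score looks like, here is the standard maximum-softmax-probability (MSP) baseline in plain Python — a common reference point in this literature, not the paper's C2IR method; the 0.5 threshold is an arbitrary illustration:

```python
import math

def softmax(logits):
    m = max(logits)                      # subtract max for stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def msp_score(logits) -> float:
    """Maximum softmax probability: high for confident (likely ID)
    predictions, low for uncertain (possibly OOD) ones."""
    return max(softmax(logits))

def is_ood(logits, threshold: float = 0.5) -> bool:
    return msp_score(logits) < threshold

# A sharply peaked logit vector looks in-distribution;
# a nearly flat one looks out-of-distribution.
```

Methods like the one the abstract describes aim to produce a better-separated score than MSP without access to any natural ID or OOD data.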