Huizhang Luo

ORCID: 0000-0003-2392-0267
Research Areas
  • Parallel Computing and Optimization Techniques
  • Advanced Data Storage Technologies
  • Algorithms and Data Compression
  • Distributed and Parallel Computing Systems
  • Advanced Memory and Neural Computing
  • Interconnection Networks and Systems
  • Computer Graphics and Visualization Techniques
  • Numerical Methods and Algorithms
  • Advanced Neural Network Applications
  • Model Reduction and Neural Networks
  • Semiconductor materials and devices
  • Cloud Computing and Resource Management
  • Ferroelectric and Negative Capacitance Devices
  • Target Tracking and Data Fusion in Sensor Networks
  • Advanced Data Compression Techniques
  • Magnetic properties of thin films
  • Caching and Content Delivery
  • Distributed systems and fault tolerance
  • Scientific Computing and Data Management
  • Fire Detection and Safety Systems
  • Topic Modeling
  • Video Surveillance and Tracking Methods

Hunan University
2021-2025

National Supercomputing Center in Wuxi
2023

New Jersey Institute of Technology
2018-2022

Chongqing University
2013-2018

Scientific simulations generate large amounts of floating-point data, which are often not very compressible using the traditional reduction schemes, such as deduplication or lossless compression. The emergence of lossy compression holds promise to satisfy the data reduction demand from HPC applications; however, it has not been widely adopted in science production. We believe a fundamental reason is that there is a lack of understanding of its benefits, pitfalls, and performance on scientific data. In this paper, we conduct...

10.1109/ipdps.2018.00044 article EN 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2018-05-01

In the realm of multimodal multi-object tracking (MOT) applications based on point clouds and images, current research predominantly focuses on enhancing tracking accuracy, often neglecting the issue of computational efficiency. Consequently, these models struggle to exhibit optimal capabilities in scenarios demanding high real-time performance. To address these challenges, this paper introduces a fast multimodal fusion model (MF-Net). The model is divided into three primary modules: object detection, fusion, and trajectory matching....

10.1109/tvt.2024.3375457 article EN IEEE Transactions on Vehicular Technology 2024-03-11

Spin-transfer torque random access memory (STT-RAM) is considered a promising candidate to replace SRAM as the next generation cache since it has better scalability and lower leakage power. Recently, 2-bit multi-level cell (MLC) STT-RAM has been proposed to further increase data density. However, a key drawback of MLC STT-RAM is that the magnetization directions of its hard and soft domains cannot be flipped to two opposite directions simultaneously, which leads to the two-step problem in state transitions. Two-step transitions would...

10.1145/2897937.2898106 article EN 2016-05-25

With the high volume and velocity of scientific data produced on high-performance computing systems, it has become increasingly critical to improve compression performance. Leveraging the general tolerance of reduced accuracy in applications, lossy compressors can achieve much higher compression ratios with a user-prescribed error bound. However, they are still far from satisfying the reduction requirements of applications. In this paper, we propose and evaluate the idea that data need to be preconditioned prior to compression, such...

10.1109/ipdps.2019.00039 article EN 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2019-05-01

Scientific simulations on high-performance computing (HPC) systems generate vast amounts of floating-point data that need to be reduced in order to lower the storage and I/O cost. Lossy compressors trade data accuracy for reduction performance and have been demonstrated to be effective in reducing data volume. However, a key hurdle to the wide adoption of lossy compressors is that the trade-off between data accuracy and compression performance, particularly the compression ratio, is not well understood. Consequently, domain scientists often exhaust many possible error bounds before...

10.1109/tpds.2019.2938503 article EN publisher-specific-oa IEEE Transactions on Parallel and Distributed Systems 2019-08-30

Scientific simulations on high-performance computing systems produce vast amounts of data that need to be stored and analyzed efficiently. Lossy compression significantly reduces the data volume by trading accuracy for performance. Despite the recent success of lossy compressors, such as ZFP and SZ, compression performance is still far from being able to keep up with the exponential growth of data. This paper aims to further take advantage of application characteristics, an area that is often under-explored, to improve the compression ratios of adaptive mesh...

10.1109/ipdps49936.2021.00048 article EN 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2021-05-01

Non-volatile memories (NVMs), such as phase change memory (PCM) and resistive random access memory (ReRAM), have emerged as promising technologies for replacing DRAM due to their advantages, such as better scalability, zero cell leakage, and DRAM-comparable read latency. Furthermore, multiple level cell (MLC) NVMs offer high data density and capacity over single level cell (SLC) NVMs. However, the adoption of MLC NVMs is limited by their high programming energy and latency as well as low endurance. In this paper, we propose an enhanced (2 <sup...

10.1109/aspdac.2018.8297381 article EN 2018 23rd Asia and South Pacific Design Automation Conference (ASP-DAC) 2018-01-01

The emerging Phase Change Memory (PCM) is considered a promising candidate to replace DRAM as the next generation main memory since it has better scalability and lower leakage power. However, high write power consumption has become a challenge in adopting PCM as main memory. In addition to the fact that writing to PCM cells requires high write current and voltage, the loss in the charge pumps (CPs) also contributes a large percentage of the power consumption. The pumping efficiency of a PCM chip is a concave function of the write current. Based on the characteristics of this function, the overall pumping efficiency can...

10.1109/aspdac.2016.7428053 article EN 2016-01-01

Today's scientific simulations are confronting seriously limited I/O bandwidth, network bandwidth, and storage capacity because of the immense volumes of data generated in high-performance computing systems. Data compression has emerged as one of the most effective approaches to resolve the issue of the exponential increase of data. However, existing state-of-the-art compressors also suffer from low throughput, especially under the trend of growing disparities between compute and I/O rates. Among them, embedded coding is widely applied, which...

10.1109/ipdps54959.2023.00107 article EN 2023 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2023-05-01

The emerging Phase Change Memory (PCM) is considered one of the most promising candidates to replace DRAM as main memory due to its better scalability and nonvolatility. With multi-bit storage capability, Multiple-Level-Cell (MLC) PCM outperforms Single-Level-Cell (SLC) PCM in density. However, high write latency is a performance bottleneck for MLC PCM for two reasons. First, MLC PCM has a much longer programming time. Second, the write latencies of different transitions between cell states range widely. When cells are concurrently written...

10.1109/nvmsa.2015.7304373 article EN 2015-08-01

High-performance computing (HPC) applications generate large amounts of floating-point data that need to be stored and analyzed efficiently to extract insights and advance knowledge discovery. With the growing disparities between compute and I/O, optimizing the storage stack alone may not suffice to cure the I/O problem. There has been a strong push in HPC communities to perform data reduction before data is transmitted in order to lower the I/O cost. However, as of now, neither lossless nor lossy compressors can achieve an adequate reduction ratio...

10.1109/lcos.2018.2855118 article EN Letters of the IEEE Computer Society 2018-01-01

Scientific simulations on high performance computing (HPC) platforms generate large quantities of data. To bridge the widening gap between compute and I/O, and to enable data to be more efficiently stored and analyzed, simulation outputs need to be refactored, reduced, and appropriately mapped to storage tiers. However, a systematic solution to support these steps has been lacking in the current HPC software ecosystem. To that end, this paper develops SIRIUS, a progressive JPEG-like data management scheme for storing and analyzing big...

10.1109/tmscs.2018.2886851 article EN IEEE Transactions on Multi-Scale Computing Systems 2018-10-01

Non-volatile memories (NVMs), such as phase change memory (PCM) and resistive random access memory (ReRAM), have emerged as promising technologies for replacing DRAM due to their advantages, such as better scalability, zero cell leakage, and DRAM-comparable read latency. Furthermore, multiple level cell (MLC) NVMs offer high data density and capacity over single level cell (SLC) NVMs. However, the adoption of MLC NVMs is limited by their high programming energy and latency as well as low endurance. In this paper, we propose an enhanced (23}2/4 WOM code...

10.5555/3201607.3201737 article EN Asia and South Pacific Design Automation Conference 2018-01-22

The emerging Phase Change Memory (PCM) is considered to be a promising candidate to replace DRAM as the next generation main memory due to its higher scalability and lower leakage power. However, high write power consumption has become a major challenge in adopting PCM as main memory. In addition to the fact that writing to PCM cells requires high write current and voltage, the loss in the charge pumps also contributes a large percentage of the power consumption. The pumping efficiency of a PCM chip is a concave function of the write current. Leveraging the characteristics of this function, the overall...

10.1145/3200139 article EN ACM Transactions on Storage 2018-08-31

Data compression can efficiently reduce the memory and persistent storage cost, which is highly desirable in modern computing systems, such as enterprise, cloud, and High-Performance Computing (HPC) environments. However, the main challenges of existing data compressors are insufficient compression ratio and low throughput. This paper focuses on improving state-of-the-art lossy compression algorithms from the view of applications. Besides, we also use the characteristics of applications to reduce the runtime overhead. To this end, we explore the idea with...

10.1109/tc.2023.3297442 article EN IEEE Transactions on Computers 2023-07-20

Sparse tensors are prevalent in real-world applications, often characterized by their large-scale, high-order, and high-dimensional nature. Directly handling raw tensors is impractical due to the significant memory and computational overhead involved. The current mainstream approach involves compressing or decomposing the original tensor. One popular tensor decomposition algorithm is the Tucker decomposition. However, existing state-of-the-art algorithms for large-scale Tucker decomposition typically relax the optimization problem into...

10.48550/arxiv.2404.10087 preprint EN arXiv (Cornell University) 2024-04-15

10.1109/icdm59182.2024.00042 article EN 2024 IEEE International Conference on Data Mining (ICDM) 2024-12-09

Scientific simulations on high-performance computing systems produce vast amounts of data that need to be stored and analyzed efficiently. Lossy compression significantly reduces the data volume by trading accuracy for performance. Despite the recent success of lossy compressors, such as ZFP and SZ, compression performance is still far from being able to keep up with the exponential growth of data. This article aims to further take advantage of application characteristics, an area that is often under-explored, to improve the compression ratios of adaptive mesh...

10.1109/tpds.2022.3168386 article EN IEEE Transactions on Parallel and Distributed Systems 2022-04-19