- Advanced Data Storage Technologies
- Parallel Computing and Optimization Techniques
- Algorithms and Data Compression
- Advanced Data Compression Techniques
- Advanced Neural Network Applications
- Domain Adaptation and Few-Shot Learning
- Distributed and Parallel Computing Systems
- Computer Graphics and Visualization Techniques
- Brain Tumor Detection and Classification
- Machine Learning and Data Classification
- Adversarial Robustness in Machine Learning
- Generative Adversarial Networks and Image Synthesis
- Advanced Vision and Imaging
- Image and Signal Denoising Methods
- Time Series Analysis and Forecasting
- Neural Networks and Applications
- Privacy-Preserving Technologies in Data
- Scientific Computing and Data Management
- Advanced Image Processing Techniques
- Distributed Systems and Fault Tolerance
- Stochastic Gradient Optimization Techniques
- Numerical Methods and Algorithms
- Data Management and Algorithms
- Machine Learning and Algorithms
- Multimodal Machine Learning Applications
Temple University (2023-2025)
Indiana University Bloomington (2022-2024)
Temple College (2024)
Indiana University (2023)
Washington State University (2020-2022)
Illinois Institute of Technology (2021)
University of California, Riverside (2021)
University of South Carolina Upstate (2021)
University of Houston (2021)
Argonne National Laboratory (2021)
Deep neural networks (DNNs) have achieved remarkable success in many fields. However, large-scale DNNs also bring high storage costs when storing snapshots to guard against frequent cluster failures, and they incur significant communication overheads when transmitting models in Federated Learning (FL). Recently, several approaches, such as Delta-DNN and LC-Checkpoint, have aimed to reduce the size of DNN snapshots by compressing the difference between two neighboring versions (a.k.a. the delta). We observe that existing approaches applying...
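The core delta idea can be sketched in a few lines. The sketch below is illustrative only (the function names and the zlib back end are assumptions, not Delta-DNN's or LC-Checkpoint's actual pipeline): quantize the element-wise difference between two neighboring snapshots so that small drifts collapse to a few repeated symbols, then entropy-code them.

    import numpy as np
    import zlib

    def delta_snapshot(prev, curr, eb):
        """Quantize the weight delta between two neighboring snapshots, then
        entropy-code it; small drifts collapse to a few repeated symbols."""
        codes = np.round((curr - prev) / (2 * eb)).astype(np.int32)
        return zlib.compress(codes.tobytes())

    def restore_snapshot(prev, blob, eb, shape):
        codes = np.frombuffer(zlib.decompress(blob), dtype=np.int32).reshape(shape)
        return prev + codes * (2 * eb)

    prev = np.random.rand(10_000).astype(np.float32)
    curr = prev + np.float32(1e-3) * np.random.randn(10_000).astype(np.float32)
    blob = delta_snapshot(prev, curr, eb=1e-3)
    approx = restore_snapshot(prev, blob, eb=1e-3, shape=curr.shape)
    print("compressed bytes:", len(blob), "max error:", np.abs(approx - curr).max())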
To better understand our universe, researchers and scientists currently run extreme-scale cosmology simulations on leadership supercomputers. However, such simulations can generate large amounts of scientific data, which often result in expensive costs for data movement and storage. Lossy compression techniques have become attractive because they can significantly reduce data size while maintaining high fidelity for post-analysis. In this paper, we propose to use GPU-based lossy compression for extreme-scale cosmological simulations. Our...
Error-bounded lossy compression is becoming an indispensable technique for the success of today's scientific projects, with vast volumes of data produced during simulations or instrument acquisitions. Not only can it significantly reduce data size, but it can also control errors based on user-specified error bounds. Autoencoder (AE) models have been widely used in image compression, but few AE-based approaches support error-bounding features, which are highly required by scientific applications. To address this issue, we...
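One common way to bolt an error-bounding stage onto any learned compressor is residual correction: compare the original data with the model's reconstruction and store quantized corrections so the final output respects the bound. The sketch below illustrates that generic stage, not this paper's method; the smoothing filter merely stands in for an AE reconstruction.

    import numpy as np

    def error_bounded_wrap(data, recon, eb):
        """Given any lossy reconstruction (e.g., from an autoencoder), emit
        quantized residual corrections so the output respects the bound."""
        residual = data - recon
        return np.round(residual / (2 * eb)).astype(np.int32)  # 0 where |residual| <= eb

    def apply_corrections(recon, codes, eb):
        return recon + codes * (2 * eb)

    rng = np.random.default_rng(0)
    data = np.cumsum(rng.normal(size=4096))             # a smooth-ish 1D field
    recon = np.convolve(data, np.ones(8) / 8, "same")   # stand-in AE reconstruction
    codes = error_bounded_wrap(data, recon, eb=0.5)
    fixed = apply_corrections(recon, codes, eb=0.5)
    print("nonzero corrections:", np.count_nonzero(codes), "max error:",
          np.abs(fixed - data).max())                   # max error <= 0.5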
Error-bounded lossy compression has been identified as a promising solution for significantly reducing scientific data volumes subject to users' requirements on distortion. Among existing error-bounded compressors, some (such as SPERR and FAZ) can reach fairly high compression ratios, while others (such as SZx, SZ, and ZFP) feature high speeds, but they rarely exhibit both high ratio and high speed at the same time. In this paper, we propose HPEZ with newly designed interpolations and quality-metric-driven auto-tuning, which features improved quality...
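The interpolation-based prediction underlying this family of compressors can be illustrated in 1D (the actual HPEZ interpolations are multi-dimensional, higher-order, and auto-tuned; this linear sketch assumes an input of length 2^k + 1 for brevity): predict each point from already-reconstructed coarser-level neighbors and quantize only the prediction error.

    import numpy as np

    def interp_compress_1d(data, eb):
        """Level-by-level linear-interpolation prediction with linear-scale
        quantization; returns integer codes (entropy coding omitted).
        Assumes len(data) == 2**k + 1 to keep the sketch short."""
        n = len(data)
        recon = np.zeros(n)
        recon[0], recon[-1] = data[0], data[-1]   # anchor points kept losslessly
        codes = np.zeros(n, dtype=np.int64)
        step = n - 1
        while step > 1:
            half = step // 2
            for i in range(half, n - 1, step):
                pred = 0.5 * (recon[i - half] + recon[i + half])
                codes[i] = int(np.round((data[i] - pred) / (2 * eb)))
                recon[i] = pred + codes[i] * (2 * eb)  # decoder sees this same value
            step = half
        return codes

    data = np.sin(np.linspace(0, 8, 1025))
    codes = interp_compress_1d(data, eb=1e-3)
    print("zero-code fraction:", np.mean(codes == 0))  # smooth data -> mostly zeros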
Today's deep neural networks (DNNs) are becoming deeper and wider because of the increasing demand on analysis quality and more complex applications to resolve. The wide and deep DNNs, however, require large amounts of resources (such as memory, storage, and I/O), significantly restricting their utilization on resource-constrained platforms. Although some DNN simplification methods (such as weight quantization) have been proposed to address this issue, they suffer from either low compression ratios or high compression errors, which may...
Error-bounded lossy compression is critical to the success of extreme-scale scientific research because of the ever-increasing volumes of data produced by today's high-performance computing (HPC) applications. Not only can error-controlled lossy compressors significantly reduce the I/O and storage burden, but they also retain high data fidelity for post-analysis. Existing state-of-the-art lossy compressors, however, generally suffer from relatively low decompression throughput (up to hundreds of megabytes per second on a single CPU...
Error-bounded lossy compression is one of the most effective techniques for reducing scientific data sizes. However, the traditional trial-and-error approach used to configure lossy compressors for finding the optimal trade-off between reconstructed data quality and compression ratio is prohibitively expensive. To resolve this issue, we develop a general-purpose analytical ratio-quality model based on the prediction-based compression framework, which can effectively foresee the reduced data ratio, as well as the impact of the compressed data on post-hoc analysis quality. Our...
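In spirit, such a model replaces trial-and-error with two cheap estimates: a bit rate from the entropy of the quantization codes, and a quality figure from the error-bound-induced distortion. The sketch below computes those two estimates under simple assumptions (errors uniform in [-eb, eb], 32-bit float inputs); the published model is considerably more detailed.

    import numpy as np

    def estimate_ratio_quality(codes, value_range, eb):
        """Two cheap estimates for a prediction-based compressor:
        bit rate from the Shannon entropy of the quantization codes, and
        PSNR assuming errors uniform in [-eb, eb] (MSE ~ eb**2 / 3)."""
        _, counts = np.unique(codes, return_counts=True)
        p = counts / counts.sum()
        bits_per_value = -(p * np.log2(p)).sum()
        est_ratio = 32.0 / max(bits_per_value, 1e-12)   # vs. 32-bit floats
        psnr = 20 * np.log10(value_range) - 10 * np.log10(eb ** 2 / 3.0)
        return est_ratio, psnr

    rng = np.random.default_rng(2)
    codes = rng.choice([-1, 0, 1], size=10_000, p=[0.1, 0.8, 0.1])
    print(estimate_ratio_quality(codes, value_range=1.0, eb=1e-3))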
Error-bounded lossy compression is a critical technique for significantly reducing scientific data volumes. With the ever-emerging heterogeneous high-performance computing (HPC) architectures, GPU-accelerated error-bounded compressors (such as CUSZ and cuZFP) have been developed. However, they suffer from either low performance or low compression ratios. To this end, we propose CUSZ+ to target both high compression ratios and throughputs. We identify that data sparsity and smoothness are key factors. Our contributions in this work are fourfold:...
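To make the role of those two factors concrete, a compressor can dispatch each data block to a different pipeline depending on how sparse or smooth it is. The thresholds and dispatch rule below are invented for illustration and are not CUSZ+'s actual GPU-side logic.

    import numpy as np

    def choose_pipeline(block, zero_thresh=0.7, smooth_thresh=0.9):
        """Pick a per-block compression path from sparsity and smoothness.
        Thresholds and rules are invented for illustration."""
        sparsity = np.mean(block == 0)
        diffs = np.abs(np.diff(block))
        smoothness = np.mean(diffs < 1e-3 * (np.ptp(block) + 1e-12))
        if sparsity > zero_thresh:
            return "sparse path (e.g., run-length encoding)"
        if smoothness > smooth_thresh:
            return "smooth path (prediction + quantization)"
        return "fallback path"

    print(choose_pipeline(np.zeros(1024)))               # mostly zeros -> sparse path
    print(choose_pipeline(np.linspace(0.0, 1.0, 1024)))  # smooth -> prediction path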
Error-bounded lossy compression is a state-of-the-art data reduction technique for HPC applications because it not only significantly reduces storage overhead but also can retain high fidelity for post-analysis. Because supercomputers are becoming heterogeneous, using accelerator-based architectures, in particular GPUs, several development teams have recently released GPU versions of their lossy compressors. However, existing GPU-based lossy compressors suffer from either low decompression throughput or...
Lossy compression and asynchronous I/O are two of the most effective solutions for reducing storage overhead and enhancing I/O performance in large-scale high-performance computing (HPC) applications. However, current approaches have limitations that prevent them from fully leveraging lossy compression, and they may also result in task collisions, which restrict the overall performance of HPC applications. To address these issues, we propose an optimization approach for the scheduling problem that encompasses computation, compression, and I/O. Our algorithm adaptively...
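The overlap being exploited can be sketched with a two-stage pipeline (hypothetical, not this paper's scheduler): the main thread compresses snapshot i while a writer thread asynchronously flushes snapshot i-1, so compression and I/O proceed concurrently instead of colliding.

    import threading, queue, zlib
    import numpy as np

    # A bounded queue hands compressed blobs to a writer thread, so the next
    # snapshot's compression overlaps the previous snapshot's write.
    write_q = queue.Queue(maxsize=2)

    def writer():
        idx = 0
        while True:
            blob = write_q.get()
            if blob is None:          # sentinel: no more snapshots
                break
            with open(f"snap_{idx}.z", "wb") as f:
                f.write(blob)
            idx += 1

    t = threading.Thread(target=writer)
    t.start()
    for step in range(4):
        snapshot = np.random.rand(1 << 16).astype(np.float32)
        blob = zlib.compress(snapshot.tobytes())  # stand-in for a lossy compressor
        write_q.put(blob)                         # hand off I/O and keep computing
    write_q.put(None)
    t.join()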
Error-bounded lossy compression has been effective in significantly reducing the data storage/transfer burden while preserving the reconstructed data fidelity very well. Many error-bounded lossy compressors have been developed for a wide range of parallel and distributed use cases over the years. These compressors are designed with distinct compression models and design principles, such that each of them features particular pros and cons. In this paper we provide a comprehensive survey of emerging error-bounded lossy compression techniques for different use cases involving big data to process. The key...
Deep neural networks (DNNs) are becoming increasingly deeper, wider, and non-linear due to the growing demands on prediction accuracy and analysis quality. Training wide and deep DNNs requires large amounts of storage resources such as memory because the intermediate activation data must be saved in memory during forward propagation and then restored for backward propagation. However, state-of-the-art accelerators such as GPUs are only equipped with very limited memory capacities due to hardware design constraints, which significantly limits...
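One way to relieve that memory pressure, sketched below as a toy rather than this paper's actual technique, is to keep only an error-bounded, lower-precision copy of each activation after the forward pass and reconstruct it for the backward pass.

    import numpy as np

    def compress_act(a, eb):
        """Error-bounded quantization: store int16 codes (2 B/value) instead of
        float32 activations (4 B/value)."""
        return np.round(a / (2 * eb)).astype(np.int16)

    def decompress_act(codes, eb):
        return codes.astype(np.float32) * (2 * eb)

    rng = np.random.default_rng(3)
    x = rng.normal(size=(32, 64)).astype(np.float32)
    W = (0.1 * rng.normal(size=(64, 64))).astype(np.float32)
    a = np.maximum(x @ W, 0.0)                # forward pass: ReLU activation
    codes = compress_act(a, eb=0.01)          # keep only the compact codes
    a_hat = decompress_act(codes, eb=0.01)    # reconstructed for backward pass
    print("max activation error:", np.abs(a_hat - a).max())  # <= eb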
Extreme-scale cosmological simulations have been widely used by today's researchers and scientists on leadership supercomputers. A new generation of error-bounded lossy compressors has been used in workflows to reduce storage requirements and minimize the impact of throughput limitations while saving large snapshots of high-fidelity data for post-hoc analysis. In this paper, we propose to adaptively provide compression configurations to compute partitions with newly designed post-analysis-aware rate-quality modeling...
Convolutional neural networks (CNNs) are becoming increasingly deeper, wider, and non-linear because of the growing demand on prediction accuracy and analysis quality. The wide and deep CNNs, however, require a large amount of computing resources and processing time. Many previous works have studied model pruning to improve inference performance, but little work has been done for effectively reducing training cost. In this paper, we propose ClickTrain: an efficient and accurate end-to-end training and pruning framework for CNNs...
As supercomputers advance towards exascale capabilities, computational intensity increases significantly, and the volume of data requiring storage and transmission experiences exponential growth. Adaptive Mesh Refinement (AMR) has emerged as an effective solution to address these two challenges. Concurrently, error-bounded lossy compression is recognized as one of the most efficient approaches to tackle the latter issue. Despite their respective advantages, few attempts have been made to investigate how AMR and error-bounded lossy compression can...
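A natural combination, sketched here as a plain heuristic rather than this paper's scheme, is to compress each AMR refinement level separately, tightening the error bound on finer levels where more accuracy is usually wanted.

    import numpy as np
    import zlib

    def compress_amr_levels(levels, base_eb):
        """Compress each AMR refinement level separately, tightening the error
        bound on finer levels (a heuristic for illustration only)."""
        blobs = []
        for depth, grid in enumerate(levels):
            eb = base_eb / (2 ** depth)            # finer level -> tighter bound
            codes = np.round(grid / (2 * eb)).astype(np.int32)
            blobs.append(zlib.compress(codes.tobytes()))
        return blobs

    coarse = np.random.rand(64, 64)                # level 0 grid
    fine = np.random.rand(128, 128)                # refined patch (level 1)
    print([len(b) for b in compress_amr_levels([coarse, fine], base_eb=1e-2)])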
Vast volumes of data are produced by today's scientific simulations and advanced instruments. These data cannot be stored and transferred efficiently because of limited I/O bandwidth, network speed, and storage capacity. Error-bounded lossy compression can be an effective method for addressing these issues: not only can it significantly reduce data size, but it can also control the data distortion based on user-defined error bounds. In practice, many scientific applications have specific requirements or constraints for lossy compression, in order to...
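A typical example of such a requirement is how the user's bound is interpreted. The helper below (names illustrative, not any specific compressor's options) maps an absolute or value-range-relative bound, two modes that compressors such as SZ commonly support, to the absolute bound used internally.

    import numpy as np

    def resolve_error_bound(data, mode, eb):
        """Map a user-specified bound to the absolute bound used internally.
        Mode names are illustrative, not a specific compressor's options."""
        if mode == "abs":
            return eb                     # absolute: |x - x'| <= eb
        if mode == "rel":
            return eb * np.ptp(data)      # relative to the data's value range
        raise ValueError(f"unknown mode: {mode}")

    data = np.random.default_rng(4).normal(size=10_000)
    print(resolve_error_bound(data, "rel", 1e-3))   # 1e-3 of the value range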
Deep neural networks (DNNs) have gained considerable attention in various real-world applications due to their strong performance on representation learning. However, a DNN needs to be trained for many epochs to pursue higher inference accuracy, which requires storing sequential versions of the DNN and releasing updated versions to users. As a result, large amounts of storage and network resources are required, significantly hampering DNN utilization on resource-constrained platforms (e.g., IoT, mobile phones).
DNNs are becoming increasingly deeper, wider, and nonlinear due to the growing demands on prediction accuracy and analysis quality. When training a DNN model, the intermediate activation data must be saved in memory during forward propagation and then restored for backward propagation. Traditional memory-saving techniques such as recomputation and migration either suffer from high performance overhead or are constrained by specific interconnect technologies with limited bandwidth. In this paper, we propose a novel...