NFDI4DS | UHH-SEMS - Publication Details

Guanpeng Li

ORCID: 0000-0001-7773-7826

About

Contact & Profiles

A5025726299

Research Areas

Radiation Effects in Electronics
Parallel Computing and Optimization Techniques
Security and Verification in Computing
Advanced Neural Network Applications
Adversarial Robustness in Machine Learning
Distributed systems and fault tolerance
Advanced Data Storage Technologies
Software Reliability and Analysis Research
Software Testing and Debugging Techniques
Domain Adaptation and Few-Shot Learning
Distributed and Parallel Computing Systems
Algorithms and Data Compression
VLSI and Analog Circuit Testing
Brain Tumor Detection and Classification
Smart Grid Security and Resilience
Generative Adversarial Networks and Image Synthesis
2D Materials and Applications
Thermodynamic and Exergetic Analyses of Power and Cooling Systems
Reproductive Biology and Fertility
Advanced Thermoelectric Materials and Devices
Advanced Thermodynamic Systems and Engines
Cloud Data Security Solutions
Autonomous Vehicle Technology and Safety
Integrated Circuits and Semiconductor Failure Analysis
Low-power high-performance VLSI design

University of Iowa
2020-2025

Argonne National Laboratory
2024

Southern University of Science and Technology
2023-2024

National Supercomputing Center in Shenzhen
2024

Hainan University
2023

Shandong Electric Power Engineering Consulting Institute Corp
2021-2023

China Power Engineering Consulting Group (China)
2021-2023

Zhengzhou University
2022

University of British Columbia
2014-2020

Chinese Academy of Sciences
2015-2017

Understanding error propagation in deep learning neural network (DNN) accelerators and applications

OPENALEX - Publications

Guanpeng Li Siva Kumar Sastry Hari Michael B. Sullivan Timothy Tsai Karthik Pattabiraman and 2 more

Deep learning neural networks (DNNs) have been successful in solving a wide range of machine learning problems. Specialized hardware accelerators have been proposed to accelerate the execution of DNN algorithms for high performance and energy efficiency. Recently, they have been deployed in datacenters (potentially for business-critical or industrial applications) and in safety-critical systems such as self-driving cars. Soft errors caused by high-energy particles have been increasing in hardware systems, and these can lead to catastrophic failures in DNN systems....

10.1145/3126908.3126964 article EN 2017-11-08

Quantifying the Accuracy of High-Level Fault Injection Techniques for Hardware Faults

OPENALEX - Publications

Jiesheng Wei Anna Thomas Guanpeng Li Karthik Pattabiraman

Hardware errors are on the rise with shrinking feature sizes; however, tolerating them in hardware is expensive. Researchers have explored software-based techniques for building error-resilient applications. Many of these techniques leverage application-specific resilience characteristics to keep overheads low. Understanding these characteristics requires software fault-injection mechanisms that are both accurate and capable of operating at a high level of abstraction to allow developers to reason about resilience. In this paper, we quantify...

10.1109/dsn.2014.2 article EN 2014-06-01

Thermoelectric properties of SnSe2 monolayer

OPENALEX - Publications

Guanpeng Li Guangqian Ding Guoying Gao

The 2H (MoS2-type) phase of 2D transition metal dichalcogenides (TMDCs) has been extensively studied and exhibits excellent electronic and optoelectronic properties, but the high phonon thermal conductivity is detrimental to thermoelectric performance. Here, we use first-principles methods combined with Boltzmann transport theory to calculate the electronic and phononic properties of the 1T (CdI2-type) SnSe2 monolayer, a recently realized 2D metal dichalcogenide semiconductor. The calculated band gap is 0.85 eV, which is a little larger than...

10.1088/0953-8984/29/1/015001 article EN Journal of Physics Condensed Matter 2016-11-10

BinFI

OPENALEX - Publications

Zitao Chen Guanpeng Li Karthik Pattabiraman Nathan DeBardeleben

As machine learning (ML) becomes pervasive in high performance computing, ML has found its way into safety-critical domains (e.g., autonomous vehicles). Thus the reliability of ML has grown in importance. Specifically, failures of ML systems can have catastrophic consequences, and can occur due to soft errors, which are increasing in frequency due to system scaling. Therefore, we need to evaluate ML systems in the presence of soft errors.

10.1145/3295500.3356177 article EN 2019-11-07

A Low-cost Fault Corrector for Deep Neural Networks through Range Restriction

OPENALEX - Publications

Zitao Chen Guanpeng Li Karthik Pattabiraman

The adoption of deep neural networks (DNNs) in safety-critical domains has engendered serious reliability concerns. A prominent example is hardware transient faults that are growing in frequency due to the progressive technology scaling, and can lead to failures in DNNs. This work proposes Ranger, a low-cost fault corrector, which directly rectifies the faulty output without re-computation. DNNs are inherently resilient to benign faults (which will not cause output corruption), but not to critical faults (which can result in erroneous output). Ranger...

10.1109/dsn48987.2021.00018 article EN 2021-06-01

Understanding Error Propagation in GPGPU Applications

OPENALEX - Publications

Guanpeng Li Karthik Pattabiraman Chen-Yang Cher Pradip Bose

GPUs have emerged as general-purpose accelerators in high-performance computing (HPC) and scientific applications. However, the reliability characteristics of GPU applications have not been investigated in depth. While error propagation has been extensively investigated for non-GPU applications, GPU applications have a very different programming model, which can have a significant effect on error propagation in them. We perform an empirical study to understand and characterize error propagation in GPU applications. We build a compiler-based fault-injection tool to track error propagation, and define metrics to characterize it. We find that GPU applications exhibit some...

10.1109/sc.2016.20 article EN 2016-11-01

Modeling Soft-Error Propagation in Programs

OPENALEX - Publications

Guanpeng Li Karthik Pattabiraman Siva Kumar Sastry Hari Michael B. Sullivan Timothy Tsai

As technology scales to lower feature sizes, devices become more susceptible to soft errors. Soft errors can lead to silent data corruptions (SDCs), seriously compromising the reliability of a system. Traditional hardware-only techniques to avoid SDCs are energy hungry, and hence not suitable for commodity systems. Researchers have proposed selective software-based protection techniques to tolerate hardware faults at lower costs. However, these techniques either use expensive fault injection or inaccurate analytical models...

10.1109/dsn.2018.00016 article EN 2018-06-01

Strain-induced enhancement of thermoelectric performance of TiS2 monolayer based on first-principles phonon and electron band structures

OPENALEX - Publications

Guanpeng Li K.L. Yao Guoying Gao

Using first-principles calculations combined with Boltzmann transport theory, we investigate the biaxial strain effect on the electronic and phonon thermal properties of a 1T (CdI2-type) structural TiS2 monolayer, a recent experimental two-dimensional (2D) material. It is found that the electronic band structure can be effectively modulated and that the band gap experiences an indirect−direct−indirect transition with increasing tensile strain. The band convergence induced by the tensile strain increases the Seebeck coefficient and the power factor, while the lattice...

10.1088/1361-6528/aa99ba article EN Nanotechnology 2017-11-10

TensorFI: A Flexible Fault Injection Framework for TensorFlow Applications

OPENALEX - Publications

Zitao Chen Niranjhana Narayanan Bo Fang Guanpeng Li Karthik Pattabiraman and 1 more

As machine learning (ML) has seen increasing adoption in safety-critical domains (e.g., autonomous vehicles), the reliability of ML systems has also grown in importance. While prior studies have proposed techniques to enable efficient error-resilience (e.g., selective instruction duplication), a fundamental requirement for realizing these techniques is a detailed understanding of the application's resilience. In this work, we present TensorFI, a high-level fault injection (FI) framework for TensorFlow-based applications. TensorFI...

10.1109/issre5003.2020.00047 article EN 2020-10-01

TensorFI: A Configurable Fault Injector for TensorFlow Applications

OPENALEX - Publications

Guanpeng Li Karthik Pattabiraman Nathan DeBardeleben

Machine Learning (ML) applications have emerged as the killer applications for next generation hardware and software platforms, and there is a lot of interest in software frameworks to build such applications. TensorFlow is a high-level dataflow framework for building ML applications and has become the most popular one in the recent past. ML applications are also being increasingly used in safety-critical systems such as self-driving cars and home robotics. Therefore, there is a compelling need to evaluate the resilience of ML applications built using TensorFlow. In this paper, we build a fault injection framework called TensorFI...

10.1109/issrew.2018.00024 article EN 2018 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW) 2018-10-01

Understanding error propagation in GPGPU applications

OPENALEX - Publications

Guanpeng Li Karthik Pattabiraman Chen-Yang Cher Pradip Bose

GPUs have emerged as general-purpose accelerators in high-performance computing (HPC) and scientific applications. However, the reliability characteristics of GPU applications have not been investigated in depth. While error propagation has been extensively investigated for non-GPU applications, GPU applications have a very different programming model, which can have a significant effect on error propagation in them. We perform an empirical study to understand and characterize error propagation in GPU applications. We build a compiler-based fault-injection tool to track error propagation, and define metrics to characterize it. We find that GPU applications exhibit some...

10.5555/3014904.3014932 article EN IEEE International Conference on High Performance Computing, Data, and Analytics 2016-11-13

GEREM: Fast and Precise Error Resilience Assessment for GPU Microarchitectures

OPENALEX - Publications

Jingweijia Tan Xinjun Li An Zhong Kaige Yan Xiaohui Wei and 1 more

10.1109/tpds.2025.3552679 article EN IEEE Transactions on Parallel and Distributed Systems 2025-01-01

Half-metals and half-semiconductors in a transition metal doped SnSe2 monolayer: a first-principles study

OPENALEX - Publications

Xuming Wu Jiangchao Han Yulin Feng Guanpeng Li Cong Wang and 2 more

Recently, a new two-dimensional (2D) semiconductor SnSe2 monolayer has been grown by molecular beam epitaxy, and weak ferromagnetic behavior above room temperature in Mn-doped SnSe2 thin films was also observed experimentally.

10.1039/c7ra07648g article EN cc-by-nc RSC Advances 2017-01-01

Understanding Error Propagation in Deep-Learning Neural Networks Accelerators and Applications

OPENALEX - Publications

Guanpeng Li Siva Kumar Sastry Hari Michael B. Sullivan Timothy Tsai Karthik Pattabiraman and 2 more

10.1109/mdat.2025.3544537 article EN IEEE Design and Test 2025-01-01

Modeling Input-Dependent Error Propagation in Programs

OPENALEX - Publications

Guanpeng Li Karthik Pattabiraman

Transient hardware faults are increasing in computer systems due to shrinking feature sizes. Traditional methods mitigate such faults through hardware duplication, which incurs huge overhead in performance and energy consumption. Therefore, researchers have explored software solutions such as selective instruction duplication, which require fine-grained analysis of instruction vulnerabilities to Silent Data Corruptions (SDCs). These vulnerabilities are typically evaluated via Fault Injection (FI), which is often highly time-consuming. Hence, most studies confine their...

10.1109/dsn.2018.00038 article EN 2018-06-01

PID-Piper: Recovering Robotic Vehicles from Physical Attacks

OPENALEX - Publications

Pritam Dash Guanpeng Li Zitao Chen Mehdi Karimibiuki Karthik Pattabiraman

Robotic Vehicles (RVs) rely extensively on sensor inputs to operate autonomously. Physical attacks such as sensor tampering and spoofing can feed erroneous sensor measurements to deviate RVs from their course and result in mission failures. In this paper, we present PID-Piper, a novel framework for automatically recovering RVs from physical attacks. We use machine learning (ML) to design an attack-resilient Feed-Forward Controller (FFC), which runs in tandem with the RV's primary controller and monitors it. Under attacks, the FFC takes...

10.1109/dsn48987.2021.00020 article EN 2021-06-01

cuSZp: An Ultra-fast GPU Error-bounded Lossy Compression Framework with Optimized End-to-End Performance

OPENALEX - Publications

Yafan Huang Sheng Di Xiaodong Yu Guanpeng Li Franck Cappello

Modern scientific applications and supercomputing systems are generating large amounts of data in various fields, leading to critical challenges in data storage footprints and communication times. To address this issue, error-bounded GPU lossy compression has been widely adopted, since it can reduce the volume of data within a customized threshold on data distortion. In this work, we propose an ultra-fast GPU lossy compressor, cuSZp. Specifically, cuSZp computes the linear recurrences with hierarchical parallelism to fuse the massive...

10.1145/3581784.3607048 article EN 2023-10-30

Fault Injection for TensorFlow Applications

OPENALEX - Publications

Niranjhana Narayanan Zitao Chen Bo Fang Guanpeng Li Karthik Pattabiraman and 1 more

As machine learning (ML) has seen increasing adoption in safety-critical domains (e.g., autonomous vehicles), the reliability of ML systems has also grown in importance. While prior studies have proposed techniques to enable efficient error-resilience (e.g., selective instruction duplication), a fundamental requirement for realizing these techniques is a detailed understanding of the application's resilience. In this work, we present TensorFI 1 and 2, high-level fault injection (FI) frameworks for TensorFlow-based applications....

10.1109/tdsc.2022.3175930 article EN IEEE Transactions on Dependable and Secure Computing 2022-07-18

A Feature-Driven Fixed-Ratio Lossy Compression Framework for Real-World Scientific Datasets

OPENALEX - Publications

Md Hasanur Rahman Sheng Di Kai Zhao Robert Underwood Guanpeng Li and 1 more

Today's scientific applications and advanced instruments are producing extremely large volumes of data every day, so that error-controlled lossy compression has become a critical technique for data storage management. Existing lossy compressors, however, are designed mainly based on an error-control driven mechanism, which cannot be efficiently applied in the fixed-ratio use case, where a desired compression ratio needs to be reached because of restricted data processing/management resources such as limited memory/storage capacity and network...

10.1109/icde55515.2023.00116 article EN 2023 IEEE 39th International Conference on Data Engineering (ICDE) 2023-04-01

Long working hours and all-cause mortality in China: A 26-year follow-up study

OPENALEX - Publications

Yeen Huang Yingping Xiang Wei Zhou Guanpeng Li Chengzhi Zhao and 2 more

OBJECTIVES: The relationship between long working hours and the risk of mortality has been debated in various countries. This study aimed to investigate the association between long working hours and all-cause mortality in a large population-based cohort in China. METHODS: This retrospective cohort study (N=10 269) used a large, nationally representative data set [the China Health and Nutrition Surveys (CHNS)] from 1989 to 2015. Long working hours (≥55 hours per week) were compared with standard working hours (35–40 hours per week). The outcome measure was all-cause mortality. Hazard ratio (HR) for all-cause mortality was calculated using Cox proportional...

10.5271/sjweh.4115 article EN cc-by Scandinavian Journal of Work Environment & Health 2023-09-04

Investigating the impact of transient hardware faults on deep learning neural network inference

OPENALEX - Publications

Md Hasanur Rahman Sabuj Laskar Guanpeng Li

Summary: Safety‐critical applications, such as autonomous vehicles, healthcare, and space applications, have witnessed widespread deployment of deep neural networks (DNNs). Inherent algorithmic inaccuracies have consistently been a prevalent cause of misclassifications, even in modern DNNs. Simultaneously, with an ongoing effort to minimize the footprint of contemporary chip design, there is a continual rise in the likelihood of transient hardware faults in deployed DNN models. Consequently, researchers have wondered about the extent to which these...

10.1002/stvr.1873 article EN cc-by Software Testing Verification and Reliability 2024-02-01

A Survey on Error-Bounded Lossy Compression for Scientific Datasets

OPENALEX - Publications

Sheng Di Jinyang Liu Kai Zhao Xin Liang Robert Underwood and 20 more

Error-bounded lossy compression has been effective in significantly reducing the data storage/transfer burden while preserving the reconstructed data fidelity very well. Many error-bounded lossy compressors have been developed for a wide range of parallel and distributed use cases for years. These lossy compressors are designed with distinct compression models and design principles, such that each of them features particular pros and cons. In this paper we provide a comprehensive survey of emerging error-bounded lossy compression techniques for different use cases involving big data to process. The key...

10.48550/arxiv.2404.02840 preprint EN arXiv (Cornell University) 2024-04-03

Diagnosis-guided Attack Recovery for Securing Robotic Vehicles from Sensor Deception Attacks

OPENALEX - Publications

Pritam Dash Guanpeng Li Mehdi Karimibiuki Karthik Pattabiraman

10.1145/3634737.3644997 article EN 2024-06-28

NVMe-oPF: Designing Efficient Priority Schemes for NVMe-over-Fabrics with Multi-Tenancy Support

OPENALEX - Publications

Darren Ng Andrew Lin Arjun Kashyap Guanpeng Li Xiaoyi Lu

10.1109/ipdps57955.2024.00052 article EN 2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2024-05-27

Fine-Grained Characterization of Faults Causing Long Latency Crashes in Programs

OPENALEX - Publications

Guanpeng Li Qining Lu Karthik Pattabiraman

As the rate of transient hardware faults increases, researchers have investigated software techniques to tolerate these faults. An important class of faults are those that cause long-latency crashes (LLCs), or faults that can persist for a long time in the program before causing it to crash. In this paper, we develop a technique to automatically find program locations where LLC-causing faults originate so that the locations can be protected to bound the program's crash latency. We first identify program code patterns that are responsible for the majority of LLC-causing faults through an empirical study. We then build...

10.1109/dsn.2015.36 article EN 2015-06-01
