NFDI4DS | UHH-SEMS - Publication Details

Jorge Parra

ORCID: 0000-0003-1852-4286

About

Contact & Profiles

A5054038123

Research Areas

Parallel Computing and Optimization Techniques
Radiation Effects in Electronics
Advanced Data Storage Technologies
Superconducting Materials and Applications
Reliability and Maintenance Optimization
Semiconductor materials and devices
Adversarial Robustness in Machine Learning
Particle accelerators and beam dynamics
Distributed and Parallel Computing Systems
Atomic and Subatomic Physics Research
Recycling and Waste Management Techniques
Integrated Circuits and Semiconductor Failure Analysis
Management and Optimization Techniques

Intel (United States)
2022-2023

Intel (United Kingdom)
2023

HPC Hardware Design Reliability Benchmarking With HDFIT

OPENALEX - Publications

Patrik Omland Alessio Netti Yang Peng Andrea Baldovin Michael Paulitsch and 4 more

Chips pack ever more, smaller transistors. Fault rates increase in turn and become more concerning, particularly at the scale of High-Performance Computing (HPC) systems: on the one hand, hardware fault protection is costly - more than 10% of silicon area for floating-point units; on the other, HPC users expect correct application output after the anticipated computation time, but workloads are seldom...

10.1109/tpds.2023.3237777 article EN IEEE Transactions on Parallel and Distributed Systems 2023-01-17

API-Based Hardware Fault Simulation for DNN Accelerators

Patrik Omland Yang Peng Michael Paulitsch Jorge Parra Gustavo Espinosa and 3 more

This article presents an application program interface (API)-based hardware fault simulation method to investigate the effect of faults on the failure probability of deep neural network (DNN) accelerators. (Fei Su, Intel Corporation)

10.1109/mdat.2022.3180977 article EN IEEE Design and Test 2022-06-08

Implementation of the AMORMS Diagnostic Tool (Asset Management Operational Reliability and Maintenance Survey) in Recycled Beverage Container Manufacturing Lines

Carlos Parra Carlos Morán Félix Pizarro Pablo Duque Andrés Aránguiz and 2 more

The effectiveness of a comprehensive maintenance and reliability management process can be assessed through an in-depth analysis of various factors that, collectively, represent the contribution to the operational production processes of an industrial asset. There are no simple formulas for designing an integrated model within an asset framework (in accordance with the ISO 55001 standard), nor are there fixed or universal rules that apply equally to all assets over time. In light of this, the primary goal of this article is...

10.20944/preprints202410.2123.v1 preprint EN 2024-10-28

Mixed Precision Support in HPC Applications: What About Reliability?

Alessio Netti Peng Yang Patrik Omland Michael Paulitsch Jorge Parra and 4 more

In their quest for exascale and beyond, High-Performance Computing (HPC) systems continue becoming ever larger and more complex. Application developers, on the other hand, leverage novel methods to improve the efficiency of their own codes: a recent trend is the use of floating-point mixed precision, or the careful interlocking of single- and double-precision arithmetic, as a tool to improve performance as well as reduce network and memory boundedness. However, while it is known that modern HPC systems suffer hardware faults at daily rates, the impact of reduced...

10.2139/ssrn.4409803 preprint EN 2023-01-01

Mixed precision support in HPC applications: What about reliability?

Alessio Netti Yang Peng Patrik Omland Michael Paulitsch Jorge Parra and 4 more

10.1016/j.jpdc.2023.104746 article EN Journal of Parallel and Distributed Computing 2023-07-25
