NFDI4DS | UHH-SEMS - Publication Details

Hartwig Anzt

ORCID: 0000-0003-2177-952X

About

Contact & Profiles

A5012133869

Research Areas

Matrix Theory and Algorithms
Parallel Computing and Optimization Techniques
Distributed and Parallel Computing Systems
Numerical Methods and Algorithms
Electromagnetic Scattering and Analysis
Advanced Numerical Methods in Computational Mathematics
Scientific Computing and Data Management
Advanced Data Storage Technologies
Stochastic Gradient Optimization Techniques
Cloud Computing and Resource Management
Interconnection Networks and Systems
Model Reduction and Neural Networks
Advanced Optimization Algorithms Research
Numerical methods for differential equations
Tensor decomposition and applications
Quantum Computing Algorithms and Architecture
Low-power high-performance VLSI design
Embedded Systems Design Techniques
Research Data Management Practices
Neural Networks and Applications
Sparse and Compressive Sensing Techniques
Radiation Effects in Electronics
Algorithms and Data Compression
Polynomial and algebraic computation
Advanced Database Systems and Queries

University of Tennessee at Knoxville
2015-2024

Heilbronn University
2024

Technical University of Munich
2024

Karlsruhe Institute of Technology
2012-2024

Universitat Politècnica de València
2023

University of Tennessee System
2015

A survey of numerical linear algebra methods utilizing mixed-precision arithmetic

OPENALEX - Publications

Ahmad Abdelfattah Hartwig Anzt Erik G. Boman Erin Carson Terry Cojean and 16 more

The efficient utilization of mixed-precision numerical linear algebra algorithms can offer attractive acceleration to scientific computing applications. Especially with the hardware integration of low-precision special-function units designed for machine learning applications, the traditional numerical linear algebra community urgently needs to reconsider the floating point formats used in the distinct operations to efficiently leverage the available compute power. In this work, we provide a comprehensive survey of mixed-precision numerical linear algebra routines, including...

10.1177/10943420211003313 article EN The International Journal of High Performance Computing Applications 2021-03-19

Ginkgo: A Modern Linear Operator Algebra Framework for High Performance Computing

OPENALEX - Publications

Hartwig Anzt Terry Cojean Goran Flegar Fritz Göbel Thomas Grützmacher and 4 more

In this article, we present Ginkgo, a modern C++ math library for scientific high performance computing. While classical linear algebra libraries act on matrix and vector objects, Ginkgo’s design principle abstracts all functionality as “linear operators,” motivating the notation of a “linear operator algebra library.” Ginkgo’s current focus is oriented toward providing sparse linear algebra functionality for high performance graphics processing unit (GPU) architectures, but given the library design, this focus can be easily extended to accommodate other algorithms and hardware architectures. We...

10.1145/3480935 article EN ACM Transactions on Mathematical Software 2022-02-16

Earth Virtualization Engines (EVE)

OPENALEX - Publications

Björn Stevens Stefan Adami Tariq Ali Hartwig Anzt Zafer Aslan and 95 more

Abstract. To manage Earth in the Anthropocene, new tools, new institutions, and new forms of international cooperation will be required. Earth Virtualization Engines is proposed as an international federation of centers of excellence to empower all people to respond to the immense and urgent challenges posed by climate change.

10.5194/essd-16-2113-2024 article EN cc-by Earth system science data 2024-04-30

Adaptive precision in block‐Jacobi preconditioning for iterative sparse linear system solvers

OPENALEX - Publications

Hartwig Anzt Jack Dongarra Goran Flegar Nicholas J. Higham Enrique S. Quintana-Ortí

Summary: We propose an adaptive scheme to reduce the communication overhead caused by data movement by selectively storing the diagonal blocks of a block‐Jacobi preconditioner in different precision formats (half, single, or double). This specialized preconditioner can then be combined with any Krylov subspace method for the solution of sparse linear systems to perform all arithmetic in double precision. We assess the effects of the adaptive precision preconditioner on the iteration count and data transfer cost of a preconditioned conjugate gradient solver. A preconditioned conjugate gradient method is, in general, a memory...

10.1002/cpe.4460 article EN Concurrency and Computation Practice and Experience 2018-03-12
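
The storage idea in the abstract above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the truncated abstract does not state how the precision is chosen per block, so the entrywise round-off test used here (`choose_format` with a hypothetical `rel_tol` parameter) is a stand-in criterion, and all function names are invented for the example. Values are rounded to the chosen format and converted back to double, mirroring the statement that all arithmetic stays in double precision.

```python
import struct

# Storage formats from cheapest to most expensive:
# (name, struct format code, bytes per value).
FORMATS = [("half", "e", 2), ("single", "f", 4), ("double", "d", 8)]


def round_to(value, code):
    """Round a Python float to the given IEEE format and back to double."""
    try:
        return struct.unpack(code, struct.pack(code, value))[0]
    except OverflowError:
        return float("inf")  # the value does not fit in this format


def choose_format(block, rel_tol):
    """Pick the cheapest format whose entrywise relative rounding error
    stays within rel_tol. Assumed criterion, for illustration only."""
    for name, code, nbytes in FORMATS:
        if all(abs(round_to(v, code) - v) <= rel_tol * abs(v)
               for row in block for v in row):
            return name, code, nbytes
    return FORMATS[-1]  # double always represents the input exactly


def store_blocks(blocks, rel_tol=1e-5):
    """Store each diagonal block in its own precision format."""
    stored = []
    for block in blocks:
        name, code, nbytes = choose_format(block, rel_tol)
        # Values are held as doubles after rounding, so later arithmetic
        # is carried out in double precision.
        values = [[round_to(v, code) for v in row] for row in block]
        n_values = sum(len(row) for row in block)
        stored.append({"format": name, "bytes": nbytes * n_values,
                       "values": values})
    return stored


blocks = [
    [[0.5, 0.25], [0.25, 2.0]],   # exactly representable in half
    [[1.0, 0.3], [0.3, 1.0]],     # 0.3 is too inexact in half for 1e-5
    [[1e40, 0.0], [0.0, 1e40]],   # overflows both half and single
]
for entry in store_blocks(blocks):
    print(entry["format"], entry["bytes"])
```

The byte counts show where the saving comes from: a block stored in half precision moves a quarter of the data of the same block stored in double.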

Incomplete Sparse Approximate Inverses for Parallel Preconditioning

OPENALEX - Publications

Hartwig Anzt Thomas Huckle Jürgen Bräckle Jack Dongarra

10.1016/j.parco.2017.10.003 article EN publisher-specific-oa Parallel Computing 2017-10-28

Improving the Performance of CA-GMRES on Multicores with Multiple GPUs

OPENALEX - Publications

Ichitaro Yamazaki Hartwig Anzt Stanimire Tomov Mark Frederick Hoemmen Jack Dongarra

The Generalized Minimum Residual (GMRES) method is one of the most widely-used iterative methods for solving nonsymmetric linear systems of equations. In recent years, techniques to avoid communication in GMRES have gained attention because, in comparison to floating-point operations, communication is becoming increasingly expensive on modern computers. Since graphics processing units (GPUs) are now becoming a crucial component in computing, we investigate the effectiveness of these techniques on multicore CPUs with multiple GPUs. While we present...

10.1109/ipdps.2014.48 article EN 2014-05-01

Implementation and Tuning of Batched Cholesky Factorization and Solve for NVIDIA GPUs

OPENALEX - Publications

Jakub Kurzak Hartwig Anzt Mark Gates Jack Dongarra

Many problems in engineering and scientific computing require the solution of a large number of small systems of linear equations. Due to their high processing power, Graphics Processing Units became an attractive target for this class of problems, and routines based on the LU and the QR factorization have been provided by NVIDIA in the cuBLAS library. This work addresses the situation where the systems of equations are symmetric positive definite. The paper describes the implementation and tuning of the kernels for the Cholesky factorization and the forward and backward substitution....

10.1109/tpds.2015.2481890 article EN IEEE Transactions on Parallel and Distributed Systems 2015-09-24
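
As a rough illustration of the problem class described in the abstract above, the following pure-Python sketch solves a batch of small symmetric positive definite systems with a Cholesky factorization followed by forward and backward substitution. It is a sequential reference under stated assumptions, not the paper's GPU kernels: the function names (`cholesky`, `solve_spd`, `batched_solve`) are hypothetical, and the loop over the batch stands in for what a GPU implementation would execute in parallel.

```python
import math


def cholesky(a):
    """Return the lower-triangular factor L with A = L * L^T."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def solve_spd(a, b):
    """Solve A x = b for one small symmetric positive definite system."""
    l = cholesky(a)
    n = len(b)
    # Forward substitution: L y = b.
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    # Backward substitution: L^T x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(l[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (y[i] - s) / l[i][i]
    return x


def batched_solve(matrices, right_hand_sides):
    """Solve many independent small systems, one after the other."""
    return [solve_spd(a, b) for a, b in zip(matrices, right_hand_sides)]


matrices = [[[4.0, 2.0], [2.0, 3.0]], [[9.0, 0.0], [0.0, 1.0]]]
right_hand_sides = [[2.0, 1.0], [9.0, 2.0]]
print(batched_solve(matrices, right_hand_sides))
```

Because the systems in a batch are independent of each other, each one can be assigned to its own group of threads, which is what makes this workload a natural fit for GPUs.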

An environment for sustainable research software in Germany and beyond: current state, open challenges, and call for action

OPENALEX - Publications

Hartwig Anzt Felix Bach Stephan Druskat Frank Löffler Axel Loewe and 39 more

Research software has become a central asset in academic research. It optimizes existing and enables new research methods, implements and embeds research knowledge, and constitutes an essential research product in itself. Research software must be sustainable in order to understand, replicate, reproduce, and build upon existing research or conduct new research effectively. In other words, software must be available, discoverable, usable, and adaptable to new needs, both now and in the future. Research software therefore requires an environment that supports sustainability. Hence,...

10.12688/f1000research.23224.2 preprint EN cc-by F1000Research 2021-01-26

A BDDC Preconditioner for the Cardiac EMI Model in three Dimensions

OPENALEX - Publications

Fritz Goebel Ngoc Monica Huynh Fatemeh Chegini Luca F. Pavarino Martin Weiser and 2 more

We analyze a Balancing Domain Decomposition by Constraints (BDDC) preconditioner for the solution of three dimensional composite Discontinuous Galerkin discretizations of reaction-diffusion systems of ordinary and partial differential equations arising in cardiac cell-by-cell models like the Extracellular space, Membrane and Intracellular space (EMI) Model. These microscopic models are essential for the understanding of events in aging and structurally diseased hearts which macroscopic models relying on homogenized descriptions of the cardiac tissue,...

10.48550/arxiv.2502.07722 preprint EN arXiv (Cornell University) 2025-02-11

Testing Strategies for OpenFOAM Projects

OPENALEX - Publications

Jan Wilhelm Gärtner Gregor Olenik Mohammed Elwardi Fadeli Lukas Petermann Andreas Kronenburg and 2 more

While testing is increasingly recognized as essential in scientific software development, it is not yet standard practice within the OpenFOAM community for developing new solvers and features. This gap stems partly from the challenges of integrating testing into typical development workflows and the limited guidance on implementing effective tests. Writing tests for complex software like OpenFOAM based projects presents unique obstacles, including the difficulty of configuring various test cases. This paper addresses these issues by discussing established test...

10.51560/ofj.v5.134 article EN OpenFOAM® Journal 2025-04-26

Preconditioned Krylov solvers on GPUs

OPENALEX - Publications

Hartwig Anzt Mark Gates Jack Dongarra Moritz Kreutzer Gerhard Wellein and 1 more

10.1016/j.parco.2017.05.006 article EN Parallel Computing 2017-05-29

Using Jacobi iterations and blocking for solving sparse triangular systems in incomplete factorization preconditioning

OPENALEX - Publications

Edmond Chow Hartwig Anzt J. A. Scott Jack Dongarra

10.1016/j.jpdc.2018.04.017 article EN publisher-specific-oa Journal of Parallel and Distributed Computing 2018-05-08

Load-balancing Sparse Matrix Vector Product Kernels on GPUs

OPENALEX - Publications

Hartwig Anzt Terry Cojean Yen‐Chen Chen Jack Dongarra Goran Flegar and 4 more

Efficient processing of Irregular Matrices on Single Instruction, Multiple Data (SIMD)-type architectures is a persistent challenge. Resolving it requires innovations in the development of data formats, computational techniques, and implementations that strike a balance between thread divergence, which is inherent for Irregular Matrices, and padding, which alleviates the performance-detrimental thread divergence but introduces artificial overheads. To this end, in this article, we address the challenge of designing high performance sparse...

10.1145/3380930 article EN ACM Transactions on Parallel Computing 2020-03-29

An environment for sustainable research software in Germany and beyond: current state, open challenges, and call for action

OPENALEX - Publications

Hartwig Anzt Felix Bach Stephan Druskat Frank Löffler Axel Loewe and 38 more

Research software has become a central asset in academic research. It optimizes existing and enables new research methods, implements and embeds research knowledge, and constitutes an essential research product in itself. Research software must be sustainable in order to understand, replicate, reproduce, and build upon existing research or conduct new research effectively. In other words, software must be available, discoverable, usable, and adaptable to new needs, both now and in the future. Research software therefore requires an environment that supports sustainability. Hence, a change is needed in the way research software development and maintenance are...

10.12688/f1000research.23224.1 preprint EN cc-by F1000Research 2020-04-27

A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic

OPENALEX - Publications

Ahmad Abdelfattah Hartwig Anzt Erik G. Boman Erin Carson Terry Cojean and 20 more

Within the past years, hardware vendors have started designing low precision special function units in response to the demand of the Machine Learning community and their demand for high compute power in low precision formats. Also the server-line products are increasingly featuring low-precision special function units, such as the NVIDIA tensor cores in ORNL's Summit supercomputer providing more than an order of magnitude higher performance than what is available in IEEE double precision. At the same time, the gap between the compute power on the one hand and the memory bandwidth on the other hand keeps...

10.48550/arxiv.2007.06674 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Earth Virtualization Engines (EVE)

OPENALEX - Publications

Björn Stevens Stefan Adami Tariq Ali Hartwig Anzt Zafer Aslan and 95 more

Abstract. To manage Earth in the Anthropocene, new tools, new institutions, and new forms of international cooperation will be required. Earth Virtualization Engines are proposed as an international federation of centers of excellence to empower all people to respond to the immense and urgent challenges posed by climate change.

10.5194/essd-2023-376 preprint EN cc-by 2023-09-22

Then and Now: Improving Software Portability, Productivity, and 100× Performance

OPENALEX - Publications

Hartwig Anzt Axel Huebl Xiaoye Sherry Li

The US Exascale Computing Project (ECP) has succeeded in preparing applications to run efficiently on the first reported Exascale supercomputers in the world. To achieve this, it modernized the whole leadership software stack, from libraries to simulation codes. In this article, we contrast selected leadership software before and after the ECP. We discuss how sustainable research software development for leadership computing can embrace the conversation with the hardware vendors, the computing facilities, the community, and the domain scientists who are the application developers and integrators of...

10.1109/mcse.2024.3387302 article EN cc-by Computing in Science & Engineering 2024-01-01

With Extreme Computing, the Rules Have Changed

OPENALEX - Publications

Jack Dongarra Stanimire Tomov Piotr Łuszczek Jakub Kurzak Mark Gates and 4 more

On the eve of exascale computing, traditional wisdom no longer applies. High-performance computing is gone as we know it. This article discusses a range of new algorithmic techniques emerging in the context of exascale computing, many of which defy the common wisdom of high-performance computing and are considered unorthodox, but could turn out to be a necessity in the near future.

10.1109/mcse.2017.48 article EN Computing in Science & Engineering 2017-04-28

Accelerating the LOBPCG method on GPUs using a blocked sparse matrix vector product

OPENALEX - Publications

Hartwig Anzt Stanimire Tomov Jack Dongarra

This paper presents a heterogeneous CPU-GPU implementation for a sparse iterative eigensolver, the Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG). For the key routine generating the Krylov search spaces via the product of a sparse matrix and a block of vectors, we propose a GPU kernel based on a modified sliced ELLPACK format. Blocking a set of vectors and processing them simultaneously accelerates the computation of a set of consecutive SpMVs significantly. Comparing the performance against similar routines from Intel's MKL...

10.5555/2872599.2872609 article EN IEEE International Conference on High Performance Computing, Data, and Analytics 2015-04-12
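
The blocking idea in the abstract above can be illustrated with a small Python sketch. It uses the plain sliced ELLPACK layout rather than the modified variant the paper proposes (whose details are not given in the truncated abstract), runs sequentially instead of on a GPU, and all names (`to_sliced_ell`, `spmm`, `slice_height`) are assumptions made for the example. Rows are grouped into slices, each slice is padded to the length of its longest row, and every stored matrix entry is applied to all vectors of the block in one pass.

```python
def to_sliced_ell(rows, slice_height):
    """Convert a matrix into sliced ELLPACK form.

    rows: one list of (column, value) pairs per matrix row.
    Returns a list of slices (first_row, width, columns, values).
    """
    slices = []
    for start in range(0, len(rows), slice_height):
        chunk = rows[start:start + slice_height]
        width = max((len(r) for r in chunk), default=0)
        # Pad every row of the slice to the slice's longest row;
        # padding entries carry value 0.0 and point at column 0.
        cols = [[c for c, _ in r] + [0] * (width - len(r)) for r in chunk]
        vals = [[v for _, v in r] + [0.0] * (width - len(r)) for r in chunk]
        slices.append((start, width, cols, vals))
    return slices


def spmm(slices, vectors, n_rows):
    """Multiply the matrix by a block of vectors, one result per vector."""
    results = [[0.0] * n_rows for _ in vectors]
    for start, width, cols, vals in slices:
        for i in range(len(cols)):
            for k in range(width):
                c, v = cols[i][k], vals[i][k]
                # Each matrix entry is read once and reused for every
                # vector in the block.
                for x, y in zip(vectors, results):
                    y[start + i] += v * x[c]
    return results


rows = [
    [(0, 2.0)],
    [(0, 1.0), (1, 3.0), (2, 1.0)],
    [(2, 4.0)],
    [(1, 5.0)],
]
slices = to_sliced_ell(rows, slice_height=2)
print(spmm(slices, [[1.0, 1.0, 1.0], [1.0, 2.0, 3.0]], n_rows=4))
```

With a slice height of 2, this example stores 8 entries for 6 nonzeros. Padding the whole matrix to its longest row, as unsliced ELLPACK would, stores 12, which is the padding overhead that slicing reduces.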

Ginkgo: A high performance numerical linear algebra library

OPENALEX - Publications

Hartwig Anzt Terry Cojean Yen‐Chen Chen Goran Flegar Fritz Göbel and 4 more

Ginkgo is a production-ready sparse linear algebra library for high performance computing on GPU-centric architectures with a high level of performance portability and focuses on software sustainability.

10.21105/joss.02260 article EN cc-by The Journal of Open Source Software 2020-08-31

Adaptive Precision Block-Jacobi for High Performance Preconditioning in the Ginkgo Linear Algebra Software

OPENALEX - Publications

Goran Flegar Hartwig Anzt Terry Cojean Enrique S. Quintana-Ortí

The use of mixed precision in numerical algorithms is a promising strategy for accelerating scientific applications. In particular, the adoption of specialized hardware and data formats for low-precision arithmetic in high-end GPUs (graphics processing units) has motivated numerous efforts aiming at carefully reducing the working precision in order to speed up the computations. For algorithms whose performance is bound by the memory bandwidth, the idea of compressing its data before (and after) memory accesses has received considerable attention. One idea is to store an...

10.1145/3441850 article EN ACM Transactions on Mathematical Software 2021-04-26

Accelerating collaborative filtering using concepts from high performance computing

OPENALEX - Publications

Mark Gates Hartwig Anzt Jakub Kurzak Jack Dongarra

In this paper we accelerate the Alternating Least Squares (ALS) algorithm used for generating product recommendations on the basis of implicit feedback datasets. We approach the algorithm with concepts proven to be successful in High Performance Computing. This includes the formulation of the algorithm as a mix of cache-optimized algorithm-specific kernels and standard BLAS routines, acceleration via graphics processing units (GPUs), use of parallel batched kernels, and autotuning to identify performance winners. For benchmark datasets,...

10.1109/bigdata.2015.7363811 article EN 2015 IEEE International Conference on Big Data (Big Data) 2015-10-01
