NFDI4DS | UHH-SEMS - Publication Details

Brian Homerding

ORCID: 0000-0002-5455-6181

About

Contact & Profiles

OpenAlex author ID: A5045515249

Research Areas

Parallel Computing and Optimization Techniques
Advanced Data Storage Technologies
Distributed and Parallel Computing Systems
Software System Performance and Reliability
Cloud Computing and Resource Management
Green IT and Sustainability
Software Engineering Research
Logic, programming, and type systems
Embedded Systems Design Techniques
Software Testing and Debugging Techniques
Distributed systems and fault tolerance
Scientific Computing and Data Management
Graph Theory and Algorithms

Argonne National Laboratory
2018-2024

Northwestern University
2022-2023

Evaluating the Performance of the hipSYCL Toolchain for HPC Kernels on NVIDIA V100 GPUs

OPENALEX - Publications

Brian Homerding, John Tramm

Future HPC leadership computing systems for the United States Department of Energy will utilize GPUs for the acceleration of scientific codes. These GPUs come from various vendors, which places a large focus on the performance portability of the programming models used by application developers. In the HPC domain, SYCL is an open C++ standard for heterogeneous computing that is gaining support. This is fueling a growing interest in understanding the performance of SYCL toolchains for the various GPU vendors.

10.1145/3388333.3388660 article EN 2020-04-27

Performance Portability Evaluation of OpenCL Benchmarks across Intel and NVIDIA Platforms

Colleen Bertoni, JaeHyuk Kwack, Thomas Applencourt, Yasaman Ghadar, Brian Homerding, and 5 more

We evaluate the capabilities of vendor-provided OpenCL implementations for performance portability across multiple computing platforms. The Rodinia benchmark suite is used for this evaluation. We apply the performance portability metric defined by Pennycook et al., and we use the roofline efficiency from the Roofline model as the "performance efficiency" in the metric's definition. We found that the performance portability metric delivered similar performance portability for several benchmarks, even if the roofline-based efficiencies of the platforms are very different among the benchmarks. To help distinguish between...

10.1109/ipdpsw50202.2020.00067 article EN 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 2020-05-01

Evaluation of Performance Portability of Applications and Mini-Apps across AMD, Intel and NVIDIA GPUs

JaeHyuk Kwack, John Tramm, Colleen Bertoni, Yasaman Ghadar, Brian Homerding, and 3 more

This paper will evaluate the progress being made on achieving performance portability by a sub-set of ECP applications, or their related mini-apps, across a diverse spectrum of application domains and approaches to performance portability. The applications and mini-apps evaluated are AMR-Wind, HACC, SW4, GAMESS RI-MP2, XSBench, and TestSNAP. These codes were redeveloped using the SYCL, OpenMP, RAJA, and Kokkos programming models and the AMReX framework. In this paper, we assess performance portability across AMD MI100, Intel Gen9, and NVIDIA A100 GPUs. Since each GPU has different...

10.1109/p3hpc54578.2021.00008 article EN 2021-11-01

NOELLE Offers Empowering LLVM Extensions

Angelo Matni, Enrico Armenio Deiana, Yian Su, Lukas Gross, Souradip Ghosh, and 8 more

Modern and emerging architectures demand increasingly complex compiler analyses and transformations. As the emphasis on compiler infrastructure moves beyond support for peephole optimizations and the extraction of instruction-level parallelism, compilers should support custom tools designed to meet these demands with higher-level analysis-powered abstractions and functionalities of wider program scope. This paper introduces NOELLE, a robust open-source domain-independent compilation layer built upon LLVM providing this...

10.1109/cgo53902.2022.9741276 article EN 2022-03-29

Paths to OpenMP in the kernel

Jiacheng Ma, Wenyi Wang, Aaron Nelson, Michael Cuevas, Brian Homerding, and 5 more

OpenMP implementations make increasing demands on the kernel. We take the next step and consider bringing OpenMP into the kernel. Our vision is that the entire OpenMP application, run-time system, and a kernel framework are interwoven to become the kernel, allowing the OpenMP implementation to take full advantage of the hardware in a custom manner. We compare and contrast three approaches to achieving this goal. The first, runtime in kernel (RTK), ports the OpenMP runtime to the kernel, allowing any kernel code to use OpenMP pragmas. The second, process in kernel (PIK), adds a specialized process abstraction for running user-level OpenMP code within the kernel. The third, custom compilation for kernel (CCK),...

10.1145/3458817.3476183 article EN 2021-10-21

The Parallel Semantics Program Dependence Graph

Brian Homerding, Atmn Patel, Enrico Armenio Deiana, Yian Su, Zujun Tan, and 4 more

A compiler's intermediate representation (IR) defines a program's execution plan by encoding its instructions and their relative order. Compiler optimizations aim to replace a given execution plan with a semantically-equivalent one that increases the program's performance for the target architecture. Alternative representations of an IR, like the Program Dependence Graph (PDG), aid this process by capturing the minimum set of constraints that semantically-equivalent execution plans must satisfy. Parallel programming like OpenMP extends a sequential execution plan by adding the possibility of running instructions in...

10.48550/arxiv.2402.00986 preprint EN arXiv (Cornell University) 2024-02-01

Enabling RAJA on Intel GPUs with SYCL

Brian Homerding, Arturo Vargas, Tom Scogland, Robert Chen, Mike Davis, and 1 more

To achieve high performance, modern HPC systems take advantage of heterogeneous GPU architectures. Often these GPUs are programmed using a vendor-preferred parallel programming model. Unfortunately, this often results in application code that is not portable across GPU vendors. To address this issue, open programming models have been introduced. One such model is provided by the RAJA Portability Suite. The RAJA portability layer provides an abstract developer API as a library through C++. In RAJA, computational kernels are lowered...

10.1145/3648115.3648131 article EN 2024-04-05

Program State Element Characterization

Enrico Armenio Deiana, Brian Suchy, Michael Wilkins, Brian Homerding, Tommy McMichen, and 4 more

Modern programming languages offer abstractions that simplify software development and allow hardware to reach its full potential. These abstractions range from the well-established OpenMP language extensions to newer C++ features like smart pointers. To properly use these abstractions in an existing codebase, programmers must determine how a given source code region interacts with Program State Elements (PSEs) (i.e., the program's variables and memory locations). We call this process Program State Element Characterization (PSEC). Without...

10.1145/3579990.3580011 article EN 2023-02-17

SPLENDID: Supporting Parallel LLVM-IR Enhanced Natural Decompilation for Interactive Development

Zujun Tan, Yebin Chon, Michael Kruse, Johannes Doerfert, Ziyang Xu, and 3 more

Manually writing parallel programs is difficult and error-prone. Automatic parallelization could address this issue, but its profitability can be limited by not having facts known only to the programmer. A parallelizing compiler that collaborates with the programmer can increase the coverage and performance of parallelization while reducing the errors and overhead associated with manual parallelization. Unlike collaboration involving analysis tools that report program properties or make parallelization suggestions to the programmer, decompiler-based collaboration could leverage the strength...

10.1145/3582016.3582058 article EN 2023-03-20
