Brian Homerding

ORCID: 0000-0002-5455-6181
Research Areas
  • Parallel Computing and Optimization Techniques
  • Advanced Data Storage Technologies
  • Distributed and Parallel Computing Systems
  • Software System Performance and Reliability
  • Cloud Computing and Resource Management
  • Green IT and Sustainability
  • Software Engineering Research
  • Logic, programming, and type systems
  • Embedded Systems Design Techniques
  • Software Testing and Debugging Techniques
  • Distributed systems and fault tolerance
  • Scientific Computing and Data Management
  • Graph Theory and Algorithms

Argonne National Laboratory
2018-2024

Northwestern University
2022-2023

Future HPC leadership computing systems for the United States Department of Energy will utilize GPUs for the acceleration of scientific codes. These GPUs come from various vendors, which places a large focus on the performance portability of the programming models used by application developers. In this domain, SYCL is an open C++ standard for heterogeneous computing that is gaining support. This is fueling a growing interest in understanding the toolchains of the GPU vendors.

10.1145/3388333.3388660 article EN 2020-04-27

We evaluate the capabilities of vendor-provided OpenCL implementations for performance portability across multiple computing platforms. The Rodinia benchmark suite is used for this evaluation. We apply the performance portability metric defined by Pennycook et al., and we use the roofline efficiency from the Roofline model as the "performance efficiency" in the metric's definition. We found that the metric delivered similar performance portability for several benchmarks, even if the roofline-based efficiencies of the platforms are very different among the benchmarks. To help distinguish between...

10.1109/ipdpsw50202.2020.00067 article EN 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 2020-05-01
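
The metric named in the abstract above can be illustrated with a small sketch. It assumes the commonly cited form of the Pennycook et al. metric (the harmonic mean of an application's efficiency over a set of platforms, or zero if any platform is unsupported) and the basic Roofline bound (the minimum of peak compute and bandwidth times arithmetic intensity). The function names, parameter names, and numbers are hypothetical and are not taken from the paper.

```python
from typing import Mapping, Optional


def roofline_efficiency(achieved_gflops: float, peak_gflops: float,
                        bandwidth_gbs: float, arithmetic_intensity: float) -> float:
    """Fraction of the Roofline bound that a kernel attains (assumed definition)."""
    # Roofline bound: limited either by peak compute or by memory traffic.
    bound = min(peak_gflops, bandwidth_gbs * arithmetic_intensity)
    return achieved_gflops / bound


def performance_portability(efficiencies: Mapping[str, Optional[float]]) -> float:
    """Harmonic mean of per-platform efficiencies; 0.0 if any platform is unsupported."""
    values = list(efficiencies.values())
    # None (or a non-positive value) marks a platform the code does not run on.
    if not values or any(e is None or e <= 0 for e in values):
        return 0.0
    return len(values) / sum(1.0 / e for e in values)


# Hypothetical efficiencies for one benchmark on two platforms.
example = {"platform_a": 0.25, "platform_b": 0.75}
score = performance_portability(example)
```

Because the harmonic mean is dominated by the smallest term, a single poorly performing platform pulls the score down sharply, and one unsupported platform drives it to zero.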

This paper will evaluate the progress being made on achieving performance portability by a sub-set of ECP applications, or their related mini-apps, across a diverse spectrum of application domains and approaches to performance portability. The mini-apps evaluated are AMR-Wind, HACC, SW4, GAMESS RI-MP2, XSBench, and TestSNAP. These codes were redeveloped using the SYCL, OpenMP, RAJA, and Kokkos programming models and the AMReX framework. In this paper we assess performance portability on the AMD MI100, Intel Gen9, and NVIDIA A100 GPUs. Since each GPU has different...

10.1109/p3hpc54578.2021.00008 article EN 2021-11-01

Modern and emerging architectures demand increasingly complex compiler analyses and transformations. As the emphasis on compiler infrastructure moves beyond support for peephole optimizations and the extraction of instruction-level parallelism, compilers should support custom tools designed to meet these demands with higher-level analysis-powered abstractions and functionalities of wider program scope. This paper introduces NOELLE, a robust open-source domain-independent compilation layer built upon LLVM providing this...

10.1109/cgo53902.2022.9741276 article EN 2022-03-29

OpenMP implementations make increasing demands on the kernel. We take the next step and consider bringing OpenMP into the kernel. Our vision is that the entire OpenMP application, run-time system, and a kernel framework is interwoven to become the kernel, allowing the OpenMP implementation to take full advantage of the hardware in a custom manner. We compare and contrast three approaches to achieving this goal. The first, runtime in kernel (RTK), ports the OpenMP runtime to the kernel, allowing any kernel code to use OpenMP pragmas. The second, process in kernel (PIK), adds a specialized process abstraction for running user-level OpenMP code within the kernel. The third, custom compilation for kernel (CCK),...

10.1145/3458817.3476183 article EN 2021-10-21

A compiler's intermediate representation (IR) defines a program's execution plan by encoding its instructions and their relative order. Compiler optimizations aim to replace a given execution plan with a semantically-equivalent one that increases the program's performance for the target architecture. Alternative representations of an IR, like the Program Dependence Graph (PDG), aid this process by capturing the minimum set of constraints that semantically-equivalent execution plans must satisfy. Parallel programming like OpenMP extends a sequential execution plan by adding the possibility of running instructions in...

10.48550/arxiv.2402.00986 preprint EN arXiv (Cornell University) 2024-02-01
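
The idea that a dependence graph captures the constraints every equivalent execution plan must satisfy can be sketched on a toy example. This is a minimal illustration, not the representation from the preprint: the four instructions, their names, and the dependence edges are hypothetical, only data dependences are modeled (a real PDG also records control dependences), and the brute-force enumeration is chosen for clarity rather than efficiency.

```python
from itertools import permutations

# Hypothetical four-instruction program (illustrative only):
#   s1: a = load x
#   s2: b = load y
#   s3: c = a + b
#   s4: store c
INSTRUCTIONS = ("s1", "s2", "s3", "s4")

# Each edge (producer, consumer) says the producer must execute first.
DEPENDENCES = {("s1", "s3"), ("s2", "s3"), ("s3", "s4")}


def respects(order, dependences):
    """Return True if every producer appears before its consumer in the order."""
    position = {inst: i for i, inst in enumerate(order)}
    return all(position[src] < position[dst] for src, dst in dependences)


def equivalent_plans(instructions, dependences):
    """Enumerate every instruction order that satisfies all dependence edges."""
    return [order for order in permutations(instructions)
            if respects(order, dependences)]


plans = equivalent_plans(INSTRUCTIONS, DEPENDENCES)
```

In this toy program the two loads are independent of each other, so the only freedom the graph leaves is swapping s1 and s2. The IR's fixed instruction order hides that freedom, while the dependence edges expose it.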

To achieve high performance, modern HPC systems take advantage of heterogeneous GPU architectures. Often these GPUs are programmed using a vendor-preferred parallel programming model. Unfortunately, this often results in application code that is not portable across GPU vendors. To address this issue, open programming models have been introduced. One such model is provided by the RAJA Portability Suite. The RAJA portability layer provides an abstract developer API as a library through C++. In RAJA, computational kernels are lowered...

10.1145/3648115.3648131 article EN 2024-04-05

Modern programming languages offer abstractions that simplify software development and allow hardware to reach its full potential. These abstractions range from the well-established OpenMP language extensions to newer C++ features like smart pointers. To properly use these abstractions in an existing codebase, programmers must determine how a given source code region interacts with Program State Elements (PSEs) (i.e., the program's variables and memory locations). We call this process Program State Element Characterization (PSEC). Without...

10.1145/3579990.3580011 article EN 2023-02-17

Manually writing parallel programs is difficult and error-prone. Automatic parallelization could address this issue, but profitability can be limited by not having facts known only to the programmer. A parallelizing compiler that collaborates with the programmer can increase the coverage and performance of parallelization while reducing the errors and overhead associated with manual parallelization. Unlike collaboration involving analysis tools that report program properties or make parallelization suggestions to the programmer, decompiler-based collaboration could leverage the strength...

10.1145/3582016.3582058 article EN 2023-03-20