Kevin Harms

ORCID: 0000-0002-3019-7532
Research Areas
  • Advanced Data Storage Technologies
  • Parallel Computing and Optimization Techniques
  • Distributed and Parallel Computing Systems
  • Cloud Computing and Resource Management
  • Distributed Systems and Fault Tolerance
  • Interconnection Networks and Systems
  • Scientific Computing and Data Management
  • Caching and Content Delivery
  • Combustion and flame dynamics
  • Advanced Combustion Engine Technologies
  • Embedded Systems Design Techniques
  • Software System Performance and Reliability
  • Defense, Military, and Policy Studies
  • Military, Security, and Education Studies
  • Military and Defense Studies
  • Education and Military Integration
  • Catalytic Processes in Materials Science
  • Cloud Data Security Solutions
  • International Human Rights and Reproductive Law
  • Trauma, Hemostasis, Coagulopathy, Resuscitation
  • Workplace Violence and Bullying
  • Radiation Effects in Electronics
  • Effects of Environmental Stressors on Livestock
  • Sexual Assault and Victimization Studies
  • Scheduling and Optimization Algorithms

Argonne National Laboratory
2016-2025

Argonne Leadership Computing Facility
2016-2024

Lawrence Berkeley National Laboratory
2021

Office of Scientific and Technical Information
2012

Computational science applications are driving a demand for increasingly powerful storage systems. While many techniques are available for capturing the I/O behavior of individual application trial runs and specific components of the system, continuous characterization of a production system remains a daunting challenge for systems with hundreds of thousands of compute cores and multiple petabytes of storage. As a result, these systems are often designed without a clear understanding of the diverse computational workloads they will support. In this...

10.1145/2027066.2027068 article EN ACM Transactions on Storage 2011-10-01
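
As a rough illustration of the kind of log analysis this line of work enables, the sketch below totals the POSIX traffic recorded in a single Darshan log. It assumes the PyDarshan package and a placeholder log file name; neither comes from the paper.

    # Minimal sketch: summarize one application's I/O from a Darshan log.
    # Assumes PyDarshan ("pip install darshan"); "example.darshan" is a
    # placeholder file name, not a file from the paper.
    import darshan

    report = darshan.DarshanReport("example.darshan", read_all=True)

    # Per-record POSIX counters as pandas DataFrames.
    posix = report.records["POSIX"].to_df()
    counters = posix["counters"]

    total_read = counters["POSIX_BYTES_READ"].sum()
    total_written = counters["POSIX_BYTES_WRITTEN"].sum()
    print(f"read {total_read} bytes, wrote {total_written} bytes")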

Today's top high performance computing systems run applications with hundreds of thousands of processes, contain hundreds of storage nodes, and must meet massive I/O requirements for capacity and performance. These leadership-class systems face daunting challenges to deploying scalable I/O systems. In this paper we present a case study of I/O scalability on Intrepid, the IBM Blue Gene/P system at the Argonne Leadership Computing Facility. Listed among the 5 fastest supercomputers of 2008, Intrepid runs computational science applications with intensive demands...

10.1145/1654059.1654100 article EN 2009-11-14

We examine the I/O behavior of thousands of supercomputing applications "in the wild," by analyzing the Darshan logs of over a million jobs representing a combined total of six years of I/O activity across three leading high-performance computing platforms. We mined these logs to analyze an application's I/O behavior across all of its runs on a platform; the evolution of an application's I/O behavior over time and across platforms; and a platform's entire I/O workload. Our analysis techniques can help developers and platform owners improve performance and system utilization by quickly identifying underperforming applications and offering...

10.1145/2749246.2749269 article EN 2015-06-08

Fail-slow hardware is an under-studied failure mode. We present a study of 114 reports of fail-slow hardware incidents, collected from large-scale cluster deployments in 14 institutions. We show that all hardware types, such as disk, SSD, CPU, memory, and network components, can exhibit performance faults. We made several important observations: faults convert from one form to another, the cascading chains of root causes and impacts can be long, and the faults have varying symptoms. From this study, we make suggestions to vendors, operators, and systems designers.

10.1145/3242086 article EN ACM Transactions on Storage 2018-08-31
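
The paper catalogs incidents rather than prescribing a detector, but it motivates peer comparison as a mitigation: flag a component whose latency is persistently worse than identical peers. A minimal sketch, with hypothetical latencies and an arbitrary 2x-median threshold:

    # Peer-comparison check for fail-slow components (illustrative only).
    # Latencies are hypothetical; the 2x-median threshold is arbitrary.
    from statistics import median

    latencies_ms = {
        "disk0": [4.1, 3.9, 4.3],
        "disk1": [4.0, 4.2, 4.1],
        "disk2": [11.8, 12.5, 13.1],  # degraded but not dead: fail-slow
    }

    peer_median = median(v for samples in latencies_ms.values() for v in samples)
    for name, samples in latencies_ms.items():
        if median(samples) > 2 * peer_median:
            print(f"{name}: suspected fail-slow "
                  f"(median {median(samples):.1f} ms vs peers {peer_median:.1f} ms)")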

Computational science applications are driving a demand for increasingly powerful storage systems. While many techniques are available for capturing the I/O behavior of individual application trial runs and specific components of the system, continuous characterization of a production system remains a daunting challenge for systems with hundreds of thousands of compute cores and multiple petabytes of storage. As a result, these systems are often designed without a clear understanding of the diverse computational workloads they will support. In this...

10.1109/msst.2011.5937212 article EN 2011-05-01

MPI is the most prominent programming model used in scientific computing today. Despite the importance of MPI, however, how applications use it in production is not well understood. This lack of understanding is attributed primarily to the fact that production systems are often wary of incorporating automatic profiling tools that perform such analysis, because of concerns about potential performance overheads. In this study, we used a lightweight profiling tool, called Autoperf, to log MPI usage characteristics on a large IBM BG/Q supercomputing system...

10.1109/sc.2018.00033 article EN 2018-11-01
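
Autoperf interposes on MPI through the PMPI profiling layer; as a stand-in, the sketch below times a collective by hand with mpi4py (an assumption, since the tool itself is not Python) to show the kind of per-call statistic such a profiler accumulates.

    # Stand-in for lightweight MPI profiling: time one collective by hand.
    # mpi4py and the rank count are illustrative assumptions; real tools
    # such as Autoperf interpose transparently via the PMPI interface.
    # Run with: mpiexec -n 4 python profile_sketch.py
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    send = np.ones(1_000_000, dtype=np.float64)
    recv = np.empty_like(send)

    t0 = MPI.Wtime()
    comm.Allreduce(send, recv, op=MPI.SUM)
    elapsed = MPI.Wtime() - t0

    # Summarize across ranks the way a profiler's report would.
    slowest = comm.reduce(elapsed, op=MPI.MAX, root=0)
    if comm.rank == 0:
        print(f"Allreduce: max time across ranks = {slowest:.6f} s")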

The increasing complexity of HPC systems has introduced new sources of variability, which can contribute to significant differences in the run-to-run performance of applications. With components at various levels of the system contributing to this variability, application developers and users are now faced with the difficult task of running and tuning their applications in an environment where performance measurements can vary by as much as a factor of two to three. In this study, we classify, quantify, and present ways to mitigate variability on Cray XC systems with Intel Xeon Phi...

10.1145/3126908.3126926 article EN 2017-11-08
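
Quantifying such variability starts from repeated timings of the same job; a small sketch with hypothetical run times, whose roughly 2x spread is in line with the factor of two to three reported above:

    # Quantify run-to-run variability from repeated timings of one job.
    # The run times (seconds) are hypothetical sample data.
    import numpy as np

    runtimes = np.array([512.0, 538.0, 947.0, 521.0, 1103.0, 530.0])

    cv = runtimes.std(ddof=1) / runtimes.mean()   # coefficient of variation
    spread = runtimes.max() / runtimes.min()      # slowest vs. fastest run
    print(f"mean {runtimes.mean():.0f} s, CV {cv:.2f}, max/min {spread:.2f}")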

Contemporary high-performance computing (HPC) applications encompass a broad range of distinct I/O strategies and are often executed on a number of different compute platforms in their lifetime. These large-scale HPC platforms employ increasingly complex I/O subsystems to provide a suitable level of I/O performance to applications. Tuning I/O workloads for such a system is nontrivial, and the results are generally not portable to other systems. I/O profiling tools can help address this challenge, but most existing tools only instrument specific...

10.1109/espt.2016.006 article EN 2016-11-01

In recent years, half precision floating-point arithmetic has gained wide support across the hardware and software stack thanks to the advance of artificial intelligence and machine learning applications. Operating at half precision can significantly reduce the memory footprint compared with operating at single or double precision. For memory-bound applications such as time domain wave simulations, this is an attractive feature. However, the narrower width of the data format can lead to degradation of solution quality due to larger roundoff errors. In this work, we...

10.1190/geo2024-0266.1 article EN Geophysics 2025-02-04
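
The roundoff concern is easy to demonstrate: accumulating many small increments in half precision stalls long before single or double precision loses accuracy. A minimal NumPy sketch (the step count and increment are arbitrary):

    # Demonstrate half-precision roundoff by accumulating a small increment.
    # float16 stalls once the increment falls below half the running sum's
    # unit in the last place; float32 and float64 stay close to exact here.
    import numpy as np

    steps, inc = 10_000, 0.1
    for dtype in (np.float16, np.float32, np.float64):
        total = dtype(0)
        for _ in range(steps):
            total = total + dtype(inc)
        print(f"{np.dtype(dtype).name}: {float(total):.2f} (exact {steps * inc:.1f})")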

I/O efficiency is essential to productivity in scientific computing, especially as many domains become more data-intensive. Many characterization tools have been used to elucidate specific aspects of parallel I/O performance, but analyzing components of complex I/O subsystems in isolation fails to provide insight into critical questions: how do the components interact, what are reasonable expectations for application performance, and what are the underlying causes of performance problems? To address these questions while capitalizing on existing...

10.1145/3149393.3149395 article EN 2017-11-03
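
One concrete form of the holistic view argued for here is to place a job's measurements in the context of historical distributions rather than judging them in isolation. A sketch with made-up throughput history:

    # Rank one job's I/O throughput against its own history; all numbers
    # are hypothetical. Judging against a distribution, rather than a
    # single "expected" value, is the holistic framing argued for above.
    import numpy as np

    history_gbps = np.array([12.1, 14.0, 13.2, 11.8, 13.9, 12.7, 14.3, 13.5])
    todays_run_gbps = 6.4

    pct = (history_gbps < todays_run_gbps).mean() * 100
    print(f"today's run sits at the {pct:.0f}th percentile of past runs")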

MPI is the most prominent programming model used in scientific computing today. Despite the importance of MPI, however, how applications use it in production is not well understood. This lack of understanding is attributed primarily to the fact that production systems are often wary of incorporating automatic profiling tools that perform such analysis, because of concerns about potential performance overheads. In this study, we used a lightweight profiling tool, called Autoperf, to log MPI usage characteristics on a large IBM BG/Q supercomputing system...

10.5555/3291656.3291696 article EN IEEE International Conference on High Performance Computing, Data, and Analytics 2018-11-11

In this paper, we propose an approach to improving the I/O performance of the IBM Blue Gene/Q supercomputing system using a novel framework that can be integrated into high performance applications. We take advantage of the system's tremendous computing resources and the interconnection bandwidth among compute nodes to efficiently exploit I/O bandwidth. The framework focuses on lossless data compression, topology-aware data movement, and subfiling. The efficacy of the solution is demonstrated with microbenchmarks and an application-level benchmark.

10.1109/pdp.2014.60 article EN 2014-02-01
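
Two of the three ingredients, subfiling and lossless compression, can be sketched with a communicator split plus zlib. This is an illustration under assumed group sizes and file names, not the paper's actual framework:

    # Subfiling with lossless compression (illustrative, not the paper's
    # framework): ranks are grouped, and each group's root gathers the
    # compressed chunks and writes one subfile, instead of every rank
    # contending for a single shared file.
    # Run with: mpiexec -n 8 python subfile_sketch.py
    import zlib
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    GROUP_SIZE = 4  # assumed; ranks 0-3 -> subfile 0, ranks 4-7 -> subfile 1

    color = comm.rank // GROUP_SIZE
    subcomm = comm.Split(color, comm.rank)

    data = np.full(1_000_000, comm.rank, dtype=np.float64)
    payload = zlib.compress(data.tobytes())  # lossless compression

    chunks = subcomm.gather(payload, root=0)
    if subcomm.rank == 0:
        with open(f"out.subfile.{color}", "wb") as f:
            for chunk in chunks:
                f.write(chunk)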

Contemporary high-performance computing (HPC) applications encompass a broad range of distinct I/O strategies and are often executed on a number of different compute platforms in their lifetime. These large-scale HPC platforms employ increasingly complex I/O subsystems to provide a suitable level of I/O performance to applications. Tuning I/O workloads for such a system is nontrivial, and the results are generally not portable to other systems. I/O profiling tools can help address this challenge, but most existing tools only instrument specific...

10.5555/3018823.3018825 article EN 2016-11-13

In preparation for the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report, the climate community will run the Coupled Model Intercomparison Project phase 5 (CMIP-5) experiments, which are designed to answer crucial questions about future regional climate change and the results of carbon feedback under different mitigation scenarios. The CMIP-5 experiments will generate petabytes of data that must be replicated seamlessly, reliably, and quickly to hundreds of research teams around the globe. As an end-to-end test...

10.1145/1851476.1851519 article EN 2010-06-21

Application performance variability caused by network contention is a major issue on dragonfly-based systems. This work-in-progress study makes two contributions. First, we analyze real workload logs and conduct application experiments on the production system Theta at Argonne to evaluate performance variability. We find a strong correlation between network utilization and performance, where high utilization (e.g., above 95%) can cause up to 21% degradation in application performance. Next, driven by this key finding, we investigate a scheduling policy to mitigate...

10.1145/3322789.3328743 article EN 2019-06-17
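
The headline correlation can be reproduced in miniature: given per-job network utilization and slowdown, a Pearson coefficient captures the relationship. The samples below are made up; only the 95% and 21% figures come from the study.

    # Correlate network utilization with application slowdown, mirroring
    # the study's analysis in miniature. The sample values are hypothetical.
    import numpy as np

    utilization = np.array([0.42, 0.65, 0.71, 0.88, 0.96, 0.97])  # fraction
    slowdown = np.array([1.00, 1.02, 1.05, 1.09, 1.18, 1.21])     # vs. best run

    r = np.corrcoef(utilization, slowdown)[0, 1]
    print(f"Pearson r = {r:.2f}")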

High Performance Computing (HPC) is an important method for scientific discovery via large-scale simulation, data analysis, or artificial intelligence. Leadership-class supercomputers are expensive, but essential to run large HPC applications. The Petascale era of supercomputing began in 2008, with the first machines achieving performance in excess of one petaflops, and with the advent of new machines in 2021 (e.g., Aurora, Frontier), the Exascale era will soon begin. However, the high theoretical computing capability (i.e., peak FLOPS) of a machine...

10.1145/3392717.3392774 article EN 2020-06-29

Summary: In order to provide a stepping stone from the Argonne Leadership Computing Facility's (ALCF) world-class production 10 petaFLOPS IBM BlueGene/Q system, Mira, to its next generation 200 petaFLOPS 3rd generation Intel Xeon Phi based system, Aurora, ALCF worked with Intel and Cray to acquire an 8.6 petaFLOPS 2nd generation Intel Xeon Phi based system named Theta. Theta was delivered, installed, integrated, and accepted on an aggressive schedule, in just over 3 months. We will detail how we were able to successfully meet that deadline, as well as lessons learned during the process.

10.1002/cpe.4336 article EN Concurrency and Computation Practice and Experience 2017-09-26

Growing evidence in the scientific computing community indicates that parallel file systems are not sufficient for all HPC storage workloads. This realization has motivated extensive research into new storage system designs. The question of which design we should turn to implies that there could be a single answer satisfying a wide range of diverse applications. We argue that such a generic solution does not exist. Instead, custom data services should be designed and tailored to the needs of specific applications on specific hardware. Furthermore, close...

10.1109/pdsw-discs.2018.00013 article EN 2018-11-01

A closed-cycle gasoline compression ignition (GCI) engine simulation near top dead center (TDC) was used to profile the performance of a parallel commercial computational fluid dynamics (CFD) code as it was scaled up to 4096 cores of an IBM Blue Gene/Q (BG/Q) supercomputer. The test case has 9 × 10⁶ cells at TDC, with a fixed mesh size of 0.15 mm, and was run in configurations ranging from 128 to 4096 cores. Profiling was done for a small duration of 0.11 crank angle degrees near TDC during ignition. Optimization of input/output (I/O)...

10.1115/1.4032623 article EN Journal of Energy Resources Technology 2016-01-29

High-performance computing (HPC) and distributed systems rely on a diverse collection of system software to provide application services, including file systems, schedulers, and web services. Such software services must manage highly concurrent requests, interact with a wide range of resources, and scale well in order to be successful. Unfortunately, no single programming model for system software currently offers optimal performance and productivity for all of these tasks. While numerous libraries, languages, and language extensions...

10.1109/nas.2012.41 article EN 2012-06-01