- Parallel Computing and Optimization Techniques
- Advanced Data Storage Technologies
- Distributed and Parallel Computing Systems
- Cloud Computing and Resource Management
- Software System Performance and Reliability
- Scientific Computing and Data Management
- Distributed Systems and Fault Tolerance
- Magnetic Confinement Fusion Research
- Physics of Superconductivity and Magnetism
- Gamma-Ray Bursts and Supernovae
- Data Mining Algorithms and Applications
- Anomaly Detection Techniques and Applications
- Advanced Database Systems and Queries
- Seismic Imaging and Inversion Techniques
- Solar and Space Plasma Dynamics
- Neural Networks and Applications
- Underwater Vehicles and Communication Systems
- Particle Accelerators and Beam Dynamics
- Image and Object Detection Techniques
- Research Data Management Practices
- Peer-to-Peer Network Technologies
- Graph Theory and Algorithms
- Video Analysis and Summarization
- Oceanographic and Atmospheric Processes
- Superconducting Materials and Applications
University of Oregon
2016-2025
Oregon Research Institute
2019-2022
University of Arizona
2022
Universitat Politècnica de Catalunya
2010-2019
Barcelona Supercomputing Center
2010-2019
Lawrence Livermore National Laboratory
2019
ParaTools (United States)
2012
We present ADIOS 2, the latest version of the Adaptable Input Output (I/O) System. ADIOS 2 addresses scientific data management needs ranging from scalable I/O in supercomputers to data analysis in personal computer and cloud systems. Version 2 introduces a unified application programming interface (API) that enables seamless data movement through files, wide-area networks, and direct memory access, as well as high-level APIs for data analysis. The internal architecture provides a set of reusable and extendable components for managing...
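The unified-API idea can be illustrated with a minimal Python sketch: two interchangeable "engines" (file-backed and in-memory) behind the same write/read calls. The names here (`FileEngine`, `MemoryEngine`, `write_step`, `read_step`) are illustrative inventions, not the actual ADIOS 2 API.

```python
import json
import tempfile

class FileEngine:
    """Persists variables to a JSON file (stand-in for a file-based engine)."""
    def __init__(self, path):
        self.path = path
    def put(self, data):
        with open(self.path, "w") as f:
            json.dump(data, f)
    def get(self):
        with open(self.path) as f:
            return json.load(f)

class MemoryEngine:
    """Keeps variables in process memory (stand-in for a memory/DMA engine)."""
    def __init__(self):
        self._store = {}
    def put(self, data):
        self._store = dict(data)
    def get(self):
        return dict(self._store)

# The application code is identical regardless of which engine is plugged in.
def write_step(engine, variables):
    engine.put(variables)

def read_step(engine):
    return engine.get()
```

The point of the sketch is that swapping the transport (file, network, memory) requires no change to the calling code, which is the portability property the abstract describes.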
The new challenges presented by exascale system architectures have resulted in difficulty achieving the desired scalability using traditional distributed-memory runtimes. Asynchronous many-task (AMT) systems are based on a paradigm that shows promise in addressing these challenges, providing application developers with a productive and performant approach to programming next-generation systems. HPX is a C++ library for concurrency and parallelism that is developed by the STE||AR Group, an international group of...
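The asynchronous many-task style can be sketched with Python's standard `concurrent.futures` standing in for HPX futures: work is decomposed into tasks, and results are composed through futures as they resolve, rather than through bulk-synchronous phases.

```python
from concurrent.futures import ThreadPoolExecutor

def task_sum(chunk):
    """One task: reduce a chunk of the input independently of the others."""
    return sum(chunk)

def parallel_sum(data, n_tasks=4):
    """Split the work into tasks, launch them asynchronously, combine futures."""
    size = max(1, len(data) // n_tasks)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(task_sum, c) for c in chunks]  # tasks run concurrently
        return sum(f.result() for f in futures)  # continuation: combine as they resolve
```

This is a deliberately small illustration of task decomposition and futurization, not HPX's actual API or scheduling model.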
Parallel applications running on high-end computer systems manifest a complexity of performance phenomena. Tools to observe parallel performance attempt to capture these phenomena in measurement datasets rich with information relating multiple performance metrics to execution dynamics and to parameters specific to the application-system experiment. However, the potential size of the datasets and the need to assimilate results from multiple experiments make it a daunting challenge not only to process the information, but to discover and understand performance insights. In this paper, we...
Empirical performance evaluation of parallel systems and applications can generate significant amounts of performance data and analysis results from multiple experiments as performance is investigated and problems are diagnosed. Hence, the management of performance information is a core component of performance analysis tools. To better support tool integration, portability, and reuse, there is strong motivation to develop data management technology that can provide a common foundation for performance data storage, access, merging, and analysis. This paper presents the design and implementation of the Performance Data Management Framework (PerfDMF). PerfDMF...
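A performance data management layer of the kind described can be sketched with a small SQLite store: profiles from multiple experiments land in one table and become queryable with ordinary SQL. The schema below is a hypothetical simplification for illustration, not PerfDMF's actual schema.

```python
import sqlite3

def create_store(conn):
    """One flat table: (experiment, code region, metric name, measured value)."""
    conn.execute("""CREATE TABLE IF NOT EXISTS profiles (
        experiment TEXT, region TEXT, metric TEXT, value REAL)""")

def record(conn, experiment, region, metric, value):
    """Store one measurement from one experiment."""
    conn.execute("INSERT INTO profiles VALUES (?, ?, ?, ?)",
                 (experiment, region, metric, value))

def mean_metric(conn, region, metric):
    """Cross-experiment analysis expressed as a plain SQL aggregate."""
    row = conn.execute(
        "SELECT AVG(value) FROM profiles WHERE region = ? AND metric = ?",
        (region, metric)).fetchone()
    return row[0]
```

The design point the sketch captures is that a common storage foundation lets merging and analysis across experiments reduce to queries rather than ad hoc file parsing.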
Exascale systems will require new approaches to performance observation, analysis, and runtime decision-making in order to optimize for efficiency. The standard "first-person" model, in which multiple operating system processes and threads observe themselves and record first-person profiles or traces for offline analysis, is not adequate to capture interactions at shared resources in highly concurrent, dynamic systems. Further, it does not support mechanisms for runtime adaptation. Our approach, called APEX (Autonomic Performance Environment...
With the rise of exascale systems and large, data-centric workflows, the need to observe and analyze high performance computing (HPC) applications during their execution is becoming increasingly important. HPC applications are typically not designed with online monitoring in mind; therefore, the observability challenge lies in being able to access interesting events at low overhead while seamlessly integrating such capabilities into existing and new applications. We explore how our service-based observation, monitoring,...
With the growing computational complexity of science and new emerging hardware, it is time to re-evaluate the traditional monolithic design of scientific codes. One paradigm is constructing larger scientific experiments from the coupling of multiple individual applications, each targeting its own physics, characteristic lengths, and/or scales. We present a framework constructed by leveraging capabilities such as in-memory communications, workflow scheduling on HPC resources, and continuous performance monitoring. This...
We present a highly scalable demonstration of a portable asynchronous many-task programming model and runtime system applied to a grid-based adaptive mesh refinement hydrodynamic simulation of a double white dwarf merger with 14 levels of refinement that spans 17 orders of magnitude in astrophysical densities. The code uses the portable C++ parallel programming model that is embodied in the HPX library and is being incorporated into the ISO C++ standard. The model represents a significant shift from the existing bulk synchronous models under consideration for exascale systems. Through...
We present the Exascale Framework for High Fidelity coupled Simulations (EFFIS), a workflow and code coupling framework developed as part of the Whole Device Modeling Application (WDMApp) in the Exascale Computing Project. EFFIS consists of a library, command line utilities, and a collection of run-time daemons. Together, these software products enable users to easily compose and execute workflows that include: strong or weak coupling, in situ (or offline) analysis/visualization/monitoring, command-and-control actions, remote...
The increasing availability of machines relying on non-GPU architectures, such as the ARM A64FX, in high-performance computing presents a set of interesting challenges to application developers. In addition to requiring code portability across different parallelization schemes, programs targeting these architectures have to be highly adaptable in terms of compute kernel sizes to accommodate the execution characteristics of various heterogeneous workloads. In this paper, we demonstrate an approach and performance results that...
Power is the most critical resource for exascale high performance computing. In the future, system administrators might have to pay close attention to the power consumption of the machine under different workloads. Hence, each application may have to run within an allocated power budget. Thus, achieving the best performance on future machines requires optimizing performance subject to a power constraint. This additional requirement should not be the responsibility of HPC (High Performance Computing) application developers. Optimizing for a given power budget should instead be handled by the high-performance software stack....
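In its simplest form, running under a power budget reduces to choosing, from a set of measured configurations, the one with the best performance whose power draw fits the cap. The toy sketch below makes that concrete; the field names `watts` and `perf` are assumptions for illustration, not part of any real tool.

```python
def best_under_budget(configs, budget_watts):
    """Pick the highest-performing configuration whose power fits the budget.

    configs: list of dicts with hypothetical keys "name", "watts", "perf".
    Returns None when no configuration is feasible under the cap.
    """
    feasible = [c for c in configs if c["watts"] <= budget_watts]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c["perf"])
```

A real runtime would make this choice dynamically (e.g. over concurrency levels or DVFS states) rather than from a static table, but the constrained-maximization structure is the same.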
We study the simulation of stellar mergers, which requires complex simulations with high computational demands. We have developed Octo-Tiger, a finite volume grid-based hydrodynamics code with Adaptive Mesh Refinement that is unique in conserving both linear and angular momentum to machine precision. To face the challenge of increasingly complex, diverse, and heterogeneous HPC systems, Octo-Tiger relies on high-level programming abstractions. We use HPX with its futurization capabilities to ensure scalability between nodes...
Characterizing the performance of scientific applications is essential for effective code optimization, both by compilers and by high-level adaptive numerical algorithms. While maximizing power efficiency is becoming increasingly important in current high-performance architectures, little or no hardware or software support exists for detailed power measurements. Hardware counter-based models are a promising method for guiding software-based techniques for reducing power. We present a component-based infrastructure...
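A counter-based power model of the kind described can be illustrated by fitting a one-counter linear model, watts = a * counter_rate + b, with least squares. This single-predictor fit is a deliberately minimal stand-in for the multi-counter models such infrastructures actually use.

```python
def fit_power_model(counter_rates, measured_watts):
    """Least-squares fit of watts = a * rate + b from calibration samples."""
    n = len(counter_rates)
    mx = sum(counter_rates) / n
    my = sum(measured_watts) / n
    sxx = sum((x - mx) ** 2 for x in counter_rates)
    sxy = sum((x - mx) * (y - my)
              for x, y in zip(counter_rates, measured_watts))
    a = sxy / sxx
    b = my - a * mx
    return a, b

def predict_watts(model, rate):
    """Estimate power from an observed counter rate using the fitted model."""
    a, b = model
    return a * rate + b
```

Once calibrated offline against measured power, such a model lets a runtime estimate power from counters alone, which is the "software support for detailed measurements" the abstract argues is missing in hardware.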
Modern parallel performance measurement systems collect performance information either through probes inserted in the application code or via statistical sampling. Probe-based techniques measure performance metrics directly using calls to a measurement library that execute as part of the application. In contrast, sampling-based systems interrupt program execution to sample metrics for statistical analysis of performance. Although both approaches are represented by robust tool frameworks in the performance community, each has its strengths and weaknesses. In this paper, we investigate...
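The probe-versus-sampling distinction can be made concrete in Python: a decorator implements a probe that measures at every call, while a background thread that periodically inspects the running frames implements a coarse sampler. Both are illustrative sketches, not the instrumentation machinery of the tools discussed.

```python
import functools
import sys
import threading
import time
from collections import Counter

# Probe-based measurement: a call to the measurement "library" (here a
# Counter update) executes as part of the application at every entry.
probe_counts = Counter()

def probed(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        probe_counts[fn.__name__] += 1   # direct, exact measurement
        return fn(*args, **kwargs)
    return wrapper

# Sampling-based measurement: an external thread periodically interrupts
# (here, inspects) the main thread and records which function is running.
def sample_stacks(stop_event, samples, interval=0.001):
    main_id = threading.main_thread().ident
    while not stop_event.is_set():
        frame = sys._current_frames().get(main_id)
        if frame is not None:
            samples[frame.f_code.co_name] += 1  # statistical, approximate
        time.sleep(interval)
```

The probe gives exact counts at the cost of perturbing every call; the sampler's cost is fixed by the interval but it only yields a statistical picture, which mirrors the trade-off the paper investigates.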
Extreme-scale computing requires a new perspective on the role of performance observation in the exascale system software stack. Because of the anticipated high concurrency and dynamic operation of these systems, it is no longer reasonable to expect that a post-mortem measurement and analysis methodology will suffice. Rather, there is a strong need for a methodology that merges first- and third-person observation, in situ analysis, and introspection across stack layers, and that serves online feedback and adaptation. In this paper we describe the DOE-funded...
A growing disparity between supercomputer computation speeds and I/O rates means that it is rapidly becoming infeasible to analyze application output only after it has been written to a file system. Instead, data-generating applications must run concurrently with data reduction and/or analysis operations, with which they exchange information via high-speed methods such as interprocess communications. The resulting parallel computing motif, online data analysis and reduction (ODAR), has important implications for both HPC systems...
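The ODAR motif, a data producer and a reducer running concurrently and exchanging data in memory instead of through the file system, can be sketched with a bounded in-process queue standing in for interprocess communication.

```python
import queue
import threading

def producer(q, n_steps):
    """Simulation stand-in: emits one data block per step, then a sentinel."""
    for step in range(n_steps):
        q.put([step * i for i in range(8)])
    q.put(None)  # sentinel: no more steps

def reducer(q, results):
    """Online reduction: consumes blocks as they arrive, keeps only a summary."""
    while True:
        block = q.get()
        if block is None:
            break
        results.append(sum(block))  # reduce the block instead of storing it

def run_pipeline(n_steps=4):
    q = queue.Queue(maxsize=2)  # bounded queue applies back-pressure to the producer
    results = []
    t = threading.Thread(target=reducer, args=(q, results))
    t.start()
    producer(q, n_steps)
    t.join()
    return results
```

The bounded queue illustrates one of the motif's system implications: if the analysis cannot keep up, the producer stalls, so the reduction rate directly constrains the simulation rate.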
The integration of scalable performance analysis into parallel development tools is difficult. The potential size of the data sets and the need to compare results from multiple experiments present a challenge to managing and processing the information. Simply characterizing the performance of applications running on potentially hundreds of thousands of processor cores requires new scalable analysis techniques. Furthermore, many exploratory analysis processes are repeatable and could be automated, but are now implemented as manual procedures. In this paper, we will discuss the current...
Automating the process of parallel performance experimentation, analysis, and problem diagnosis can enhance environments for performance-directed application development, compilation, and execution. This is especially true when parametric studies, modeling, and optimization strategies require large amounts of data to be collected and processed for knowledge synthesis and reuse. This paper describes the integration of the PerfExplorer performance data mining framework with the OpenUH compiler infrastructure, which provides auto-instrumentation of source code...
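One small example of the kind of analysis a performance data mining framework can automate across a parametric study is correlating a tuning parameter with a measured metric. The Pearson correlation below is a generic, stdlib-only illustration, not PerfExplorer's implementation.

```python
def pearson(xs, ys):
    """Pearson correlation between two equal-length series of observations,
    e.g. a compiler parameter setting vs. measured runtime per experiment."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5
```

Run over every (parameter, metric) pair stored from a study, a statistic like this flags which parameters actually move performance, which is the sort of repeatable exploratory step the paper argues should not remain a manual procedure.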
As access to supercomputing resources becomes more and more commonplace, performance analysis tools are gaining importance as a way to decrease the gap between achieved application performance and the supercomputers' peak performance. Performance tools allow the analyst to understand the idiosyncrasies of an application and to improve it. However, these tools require monitored regions to provide information to analysts, leaving non-monitored code unknown, which may result in a lack of understanding of important parts of the application. In this paper we describe an automated methodology that...
Due to the sheer volume of data, it is typically impractical to analyze the detailed performance of an HPC application running at scale. While conventional small-scale benchmarking and scaling studies are often sufficient for simple applications, many modern workflow-based applications couple multiple elements with competing resource demands and complex inter-communication patterns, which cannot easily be studied in isolation at small scale. This work discusses Chimbuko, a performance analysis framework that provides...
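One building block of at-scale online analysis is flagging, in a stream of timings, values that deviate strongly from a running baseline, so that only anomalous events need to be kept. The sketch below uses Welford's online mean/variance algorithm with a z-score rule; it is a simple stand-in for illustration, not Chimbuko's actual detector.

```python
class StreamingAnomalyDetector:
    """Flags values more than `threshold` standard deviations from the
    running mean, updating mean/variance online (Welford's algorithm)."""

    def __init__(self, threshold=3.0):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0          # sum of squared deviations from the mean
        self.threshold = threshold

    def observe(self, x):
        """Return True if x is anomalous relative to the data seen so far,
        then fold x into the running statistics."""
        if self.n >= 2:
            std = (self.m2 / (self.n - 1)) ** 0.5
            anomalous = std > 0 and abs(x - self.mean) > self.threshold * std
        else:
            anomalous = False  # not enough history to judge yet
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        return anomalous
```

Because the detector keeps only three scalars per metric, it can run next to the application and discard normal measurements, addressing exactly the data-volume problem the abstract raises.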
SOS is a new model for the online in situ characterization and analysis of complex high-performance computing applications. SOS employs a data framework with distributed information management and structured query and access capabilities. The primary design objectives are flexibility, scalability, and programmability. SOS provides a complete framework that can be configured and used directly by an application, allowing detailed online observation of a scientific workflow. This paper describes experiments to validate and explore the performance characteristics of...
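The publish-then-query pattern at the heart of such online observation can be sketched as a tiny in-memory store: producers publish timestamped values while analysis code queries them during the run. The class and method names (`SOSLikeStore`, `publish`, `query`, `latest`) are invented for illustration and are not the real SOS interface.

```python
import time

class SOSLikeStore:
    """Minimal stand-in for an online publish/query store."""

    def __init__(self):
        self._rows = []  # (timestamp, source, key, value)

    def publish(self, source, key, value):
        """A running component publishes one observation."""
        self._rows.append((time.time(), source, key, value))

    def query(self, key):
        """Structured access: all published values for a key, oldest first."""
        return [v for (_, _, k, v) in self._rows if k == key]

    def latest(self, key):
        """Most recent value for a key, or None if nothing was published."""
        vals = self.query(key)
        return vals[-1] if vals else None
```

The essential property the sketch shows is that observation data becomes queryable while the application is still running, rather than only after post-mortem file collection.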