- Parallel Computing and Optimization Techniques
- Distributed and Parallel Computing Systems
- Advanced Data Storage Technologies
- Software System Performance and Reliability
- Distributed systems and fault tolerance
- Cloud Computing and Resource Management
- Scientific Computing and Data Management
- Embedded Systems Design Techniques
- Simulation Techniques and Applications
- Magnetic confinement fusion research
- Interconnection Networks and Systems
- Semiconductor materials and devices
- Radiation Effects in Electronics
- Ionosphere and magnetosphere dynamics
- Superconducting Materials and Applications
- Computational Physics and Python Applications
- Real-Time Systems Scheduling
- Combustion and flame dynamics
- Teaching and Learning Programming
- Advanced NMR Techniques and Applications
- Data Mining Algorithms and Applications
- Computational Fluid Dynamics and Aerodynamics
- Advanced Combustion Engine Technologies
- Software Reliability and Analysis Research
- Semiconductor Lasers and Optical Devices
University of Oregon
2013-2025
ParaTools (United States)
2011-2022
Lawrence Berkeley National Laboratory
2020-2022
Lawrence Livermore National Laboratory
2006-2020
Los Alamos National Laboratory
1999-2020
Sandia National Laboratories
2020
New Mexico Consortium
2020
University of New Mexico
2020
Red Hat (United States)
2020
National Energy Research Scientific Computing Center
2020
The ability of performance technology to keep pace with the growing complexity of parallel and distributed systems depends on robust frameworks that can at once provide system-specific capabilities and support high-level problem solving. Flexibility and portability in empirical methods and processes are influenced primarily by the strategies available for instrumentation and measurement, and by how effectively they are integrated and composed. This paper presents the TAU (Tuning and Analysis Utilities) system and describes how it addresses diverse...
Computational science is paramount to the understanding of the underlying processes in internal combustion engines of the future that will utilize non-petroleum-based alternative fuels, including carbon-neutral biofuels, and burn in new regimes that attain high efficiency while minimizing emissions of particulates and nitrogen oxides. Next-generation engines will likely operate at higher pressures, with greater amounts of dilution, and with fuels that exhibit a wide range of chemical and physical properties. Therefore, there is a significant role for...
The Common Component Architecture (CCA) provides a means for software developers to manage the complexity of large-scale scientific simulations and to move toward a plug-and-play environment for high-performance computing. In the scientific computing context, component models also promote collaboration using independently developed software, thereby allowing particular individuals or groups to focus on the aspects of greatest interest to them. The CCA supports parallel and distributed computing as well as local connections between components...
The power of GPUs is giving rise to heterogeneous parallel computing, with new demands on programming environments, runtime systems, and tools to deliver high-performing applications. This paper studies the problems associated with performance measurement on machines with GPUs. A computation model and alternative host-GPU approaches are discussed to set the stage for reporting capabilities in three leading HPC tools: PAPI, Vampir, and the TAU Performance System. Our work leverages CUPTI tool support in NVIDIA's CUDA device...
Portable profiling and tracing for parallel, scientific applications using C++. Authors: Sameer Shende (Department of Computer and Information Science, University of Oregon, Eugene, OR), Allen D. Malony, Janice Cuny, Peter Beckman (Advanced Computing Laboratory, Los Alamos National Laboratory, Los Alamos, NM), Steve Karmesin, Kathleen Lindlan. SPDT '98: Proceedings of the SIGMETRICS Symposium on Parallel and Distributed Tools, August 1998. Pages...
The developers of high-performance scientific applications often work in complex computing environments that place heavy demands on program analysis tools. They need tools that interoperate, are portable across machine architectures, and provide source-level feedback. In this paper, we describe a tool framework, the Program Database Toolkit (PDT), that supports the development of tools meeting these requirements. PDT uses compile-time information to create a complete database of high-level program information that is structured for well-defined...
The TAU Performance System® is a portable profiling and tracing toolkit for performance analysis of parallel programs written in Fortran, C, C++, UPC, Java, and Python. TAU (Tuning and Analysis Utilities) is capable of gathering performance information through instrumentation of functions, methods, basic blocks, and statements, as well as through event-based sampling. All C++ language features are supported, including templates and namespaces. The API also provides selection of groups for organizing and controlling instrumentation. Instrumentation can be inserted in the source...
Electronic structure calculations are a widely used tool in materials science and a large consumer of supercomputing resources. Traditionally, the software packages for these kinds of simulations have been implemented in compiled languages, where Fortran in its different versions has been the most popular choice. While dynamic, interpreted languages such as Python can increase the efficiency of the programmer, they cannot compete directly with the raw performance of compiled languages. However, by using an interpreted language together with a compiled language, it is possible...
The use of global address space languages and one-sided communication for complex applications is gaining attention in the parallel computing community. However, the lack of good evaluative methods to observe multiple levels of performance makes it difficult to isolate the cause of performance deficiencies and to understand the fundamental limitations of system and application design for future improvement. NWChem is a popular computational chemistry package, which depends on the Global Arrays/Aggregate Remote Memory Copy Interface suite for partitioned...
Modern parallel performance measurement systems collect information either through probes inserted in the application code or via statistical sampling. Probe-based techniques measure metrics directly using calls to a library that execute as part of the application. In contrast, sampling-based systems interrupt program execution to sample metrics for analysis of performance. Although both approaches are represented by robust tool frameworks in the community, each has its strengths and weaknesses. In this paper, we investigate...
Extreme-scale computing requires a new perspective on the role of performance observation in the Exascale system software stack. Because of the anticipated high concurrency and dynamic operation of these systems, it is no longer reasonable to expect that a post-mortem measurement and analysis methodology will suffice. Rather, there is a strong need for performance observation that merges first- and third-person observation, in situ analysis, and introspection across stack layers, and that serves online feedback and adaptation. In this paper we describe the DOE-funded...
SMARTS: exploiting temporal locality and parallelism through vertical execution. Authors: Suvas Vajracharya (Los Alamos National Laboratory, Los Alamos, NM), Steve Karmesin, Peter Beckman, James Crotinger, Allen Malony (Dept. of Computer and Information Science, University of Oregon), Sameer Shende, Rod Oldehoeft, Stephen Smith. ICS '99: Proceedings of the 13th International Conference on Supercomputing, June 1999. Pages...