- Parallel Computing and Optimization Techniques
- Advanced Data Storage Technologies
- Distributed and Parallel Computing Systems
- Cloud Computing and Resource Management
- Interconnection Networks and Systems
- Embedded Systems Design Techniques
- Scientific Computing and Data Management
- Distributed Systems and Fault Tolerance
- Spectroscopy and Quantum Chemical Studies
- Advanced Chemical Physics Studies
- Synthesis and Biological Activity
- Geological and Geophysical Studies
- Synthesis and Characterization of Heterocyclic Compounds
- Caching and Content Delivery
- Semiconductor Materials and Devices
- Advanced Memory and Neural Computing
- Estrogen and Related Hormone Effects
- Human Mobility and Location-Based Analysis
- Computational Drug Discovery Methods
- Software-Defined Networks and 5G
- Tunneling and Rock Mechanics
- Green IT and Sustainability
- User Authentication and Security Systems
- Quantum Information and Cryptography
- Quantum, Superfluid, Helium Dynamics
Lawrence Berkeley National Laboratory
2015-2024
National Energy Research Scientific Computing Center
2010-2024
Protein Express (United States)
2024
Advanced Technologies Group (United States)
2024
Frederick National Laboratory for Cancer Research
2024
Indiana University
2022
Swisscom (Switzerland)
2022
ETH Zurich
2022
Institut national de recherche en informatique et en automatique
2022
Barcelona Supercomputing Center
2021
Cloud computing has seen tremendous growth, particularly for commercial web applications. The on-demand, pay-as-you-go model creates a flexible and cost-effective means to access compute resources. For these reasons, the scientific community has shown increasing interest in exploring cloud computing. However, the underlying implementation and performance of clouds are very different from those at traditional supercomputing centers. It is therefore critical to evaluate the performance of HPC applications in today's cloud environments...
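As an illustration of the kind of kernel-level measurement such an evaluation relies on (not this paper's actual benchmark suite), the sketch below times a dense matrix multiply and reports sustained GFLOP/s; the problem size is an arbitrary assumption.

```python
# Minimal sketch: time a DGEMM-like kernel to compare sustained
# floating-point throughput across platforms. The size n is an
# arbitrary choice for illustration.
import time
import numpy as np

n = 2048
a = np.random.rand(n, n)
b = np.random.rand(n, n)

t0 = time.perf_counter()
c = a @ b
elapsed = time.perf_counter() - t0

flops = 2.0 * n**3  # multiplies and adds in an n^3 matmul
print(f"{flops / elapsed / 1e9:.1f} GFLOP/s over {elapsed:.3f} s")
```

Running the same probe on a cloud instance and on a supercomputer node gives a first-order view of the compute gap before application-level effects enter.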
To achieve exascale computing, fundamental hardware architectures must change. This will significantly impact scientific applications that run on current high performance computing (HPC) systems, many of which codify years of domain knowledge and refinements for contemporary computer systems. To adapt to exascale architectures, developers must be able to reason about new hardware and determine what programming models and algorithms will provide the best blend of performance and energy efficiency in the future. An abstract machine model is designed to expose...
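A minimal sketch of what such an abstract machine model might encode is shown below; the structure and all numeric parameters are hypothetical illustrations, not the model proposed in the paper.

```python
# Hypothetical abstract machine model: a node described by its memory
# hierarchy, with per-level bandwidth and energy-per-byte figures
# (all values invented for illustration).
from dataclasses import dataclass

@dataclass
class MemoryLevel:
    name: str
    capacity_gb: float
    bandwidth_gbs: float        # GB/s
    energy_pj_per_byte: float   # pJ/byte

@dataclass
class AbstractNode:
    cores: int
    peak_gflops: float
    memory: list

    def transfer_cost(self, level_name: str, gigabytes: float):
        """Rough time (s) and energy (J) to move data at one level."""
        level = next(m for m in self.memory if m.name == level_name)
        seconds = gigabytes / level.bandwidth_gbs
        joules = gigabytes * 1e9 * level.energy_pj_per_byte * 1e-12
        return seconds, joules

node = AbstractNode(
    cores=64, peak_gflops=3000.0,
    memory=[MemoryLevel("HBM", 16, 900.0, 7.0),
            MemoryLevel("DDR", 256, 120.0, 20.0)])
print(node.transfer_cost("DDR", 4.0))  # (seconds, joules) for 4 GB
```

The point of such a model is exactly this kind of back-of-the-envelope reasoning: comparing where time and energy go before committing an algorithm to a future machine.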
Interconnection networks are a critical resource for large supercomputers. The dragonfly topology, which provides low network diameter and high bisection bandwidth, is being explored as a promising option for building multi-petaflop/s and exaflop/s systems. Unlike the extensively studied torus networks, the best choices of message routing and job placement strategies for this topology are not well understood. This paper aims at analyzing the behavior of a machine built using a dragonfly topology under various routing strategies, placement policies, and application communication...
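To make the routing question concrete, the sketch below counts hops for minimal (direct) routing in a simplified dragonfly: intra-group links form a complete graph, each pair of groups shares exactly one global link, and the gateway assignment is a convention assumed purely for illustration.

```python
# Sketch: hop count under minimal routing in a simplified dragonfly.
# Assumptions (illustration only): all-to-all local links inside a
# group, one global link per group pair, and a fixed convention for
# which local router hosts each global link.
def minimal_hops(src, dst, routers_per_group):
    (sg, sr), (dg, dr) = src, dst        # (group, router) pairs
    if sg == dg:
        return 0 if sr == dr else 1      # at most one local hop
    # Gateway routers holding the global link between sg and dg
    # (hypothetical assignment convention).
    src_gw = (dg - 1 if dg > sg else dg) % routers_per_group
    dst_gw = (sg - 1 if sg > dg else sg) % routers_per_group
    # optional local hop + one global hop + optional local hop
    return (sr != src_gw) + 1 + (dr != dst_gw)

# Worst case for minimal routing is 3 hops: local, global, local.
print(minimal_hops((0, 3), (5, 1), routers_per_group=8))
```

The low hop count is what makes dragonfly attractive, but it also concentrates traffic on single global links, which is why routing and job placement interact so strongly.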
Contemporary high-performance computing (HPC) applications encompass a broad range of distinct I/O strategies and are often executed on a number of different compute platforms in their lifetime. These large-scale HPC platforms employ increasingly complex storage subsystems to provide a suitable level of performance to applications. Tuning workloads for such a system is nontrivial, and the results are generally not portable to other systems. I/O profiling tools can help address this challenge, but most existing tools only instrument specific...
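As a toy illustration of what such instrumentation does (far simpler than a production I/O profiler), the sketch below wraps Python file objects to accumulate per-file byte counts and time; all names here are invented for the example.

```python
# Toy I/O profiler: wrap file objects to count bytes and time per file.
# Invented illustration, not any real tool's API.
import time
from collections import defaultdict

stats = defaultdict(lambda: {"read": 0, "written": 0, "seconds": 0.0})

class ProfiledFile:
    def __init__(self, path, *args, **kwargs):
        self._path = path
        self._f = open(path, *args, **kwargs)
    def read(self, *args):
        t0 = time.perf_counter()
        data = self._f.read(*args)
        s = stats[self._path]
        s["read"] += len(data)   # chars in text mode, bytes in binary
        s["seconds"] += time.perf_counter() - t0
        return data
    def write(self, data):
        t0 = time.perf_counter()
        n = self._f.write(data)
        s = stats[self._path]
        s["written"] += n
        s["seconds"] += time.perf_counter() - t0
        return n
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        self._f.close()

with ProfiledFile("/tmp/demo.dat", "w") as f:
    f.write("x" * 1024)
print(dict(stats))
```

Real profilers intercept I/O at the library or system-call layer so that applications need no source changes, but the counters they gather are of this same shape.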
Potential energy surface points computed from variants of density functional theory (DFT) are used to calculate directly the anharmonic vibrational frequencies of H2O, Cl−H2O, and (H2O)2. The method is an adaptation to DFT of a recent algorithm for direct calculations of anharmonic frequencies using ab initio electronic structure codes. Calculations are performed with the BLYP and B3LYP functionals, and the results are compared with experiment and also with those calculated from potentials obtained with Møller-Plesset second-order perturbation theory (MP2). The calculation of states...
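For context, the quantity such direct calculations capture beyond the harmonic approximation is the anharmonic shift in vibrational levels; for a one-dimensional mode the standard spectroscopic expansion reads (a textbook relation, not this paper's working equations):

```latex
% Anharmonic vibrational term values (1D mode, spectroscopic expansion)
E_n / hc = \omega_e \left(n + \tfrac{1}{2}\right)
         - \omega_e x_e \left(n + \tfrac{1}{2}\right)^2 ,
\qquad
\nu_{0 \to 1} = \omega_e - 2\,\omega_e x_e .
```

The anharmonicity constant ωexe lowers the fundamental below the harmonic frequency ωe, which is why purely harmonic DFT frequencies systematically overestimate observed transitions.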
Text-based password systems are the authentication mechanism most commonly used on computer systems. Graphical passwords have recently been proposed because the pictorial-superiority effect suggests that people have better memory for images. The most widely advocated graphical schemes are based on recognition rather than recall. This approach is favored because recognition is a more effective manner of retrieval than recall, exhibiting greater accuracy and longevity of the retained material. However, schemes such as these could combine both the use of images and a recall-based mechanism. This paper...
Parallel I/O is fast becoming a bottleneck to the research agendas of many users of extreme-scale parallel computers. The principal cause of this is the concurrency explosion of high-end computation, coupled with the complexity of providing file systems that perform reliably at such scales. More than just being a bottleneck, I/O performance is notoriously variable, influenced by numerous factors inside and outside the application, thus making it extremely difficult to isolate cause and effect for performance events. In this paper, we propose a statistical...
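A minimal sketch of the kind of statistical treatment this suggests is below: fitting observed I/O times to candidate explanatory factors with ordinary least squares. The data and factor names are synthetic, not the paper's actual model.

```python
# Sketch: attribute I/O time variability to candidate factors with a
# linear least-squares fit. Data and factor names are synthetic.
import numpy as np

rng = np.random.default_rng(0)
n = 200
bytes_moved = rng.uniform(1, 100, n)   # GB per run
server_load = rng.uniform(0, 1, n)     # contention proxy
noise = rng.normal(0, 0.5, n)
io_time = 0.8 * bytes_moved + 12.0 * server_load + 5.0 + noise

X = np.column_stack([bytes_moved, server_load, np.ones(n)])
coef, *_ = np.linalg.lstsq(X, io_time, rcond=None)
print("fit: time ~ %.2f*GB + %.2f*load + %.2f" % tuple(coef))
```

Separating the factor the application controls (bytes moved) from external ones (shared-server load) is exactly the cause-and-effect isolation the abstract describes as difficult.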
I/O performance is a critical aspect of data-intensive scientific computing. We seek to advance the state of the practice in understanding and diagnosing I/O performance issues through investigation of a comprehensive data set that captures a full year of production storage activity at two leadership-scale computing facilities. We demonstrate techniques to identify regions of interest, perform focused investigations of both long-term trends and transient anomalies, and uncover the contributing factors that lead to performance fluctuation. We find that the life of a parallel file...
The forward–backward semiclassical dynamics (FBSD) scheme for obtaining time correlation functions shows much promise as a method for including quantum mechanical effects in the calculation of dynamical properties of condensed-phase systems. By combining this scheme with a discretized path integral representation of the Boltzmann operator, one is able to calculate correlation functions at finite temperature. In this work we develop constant-temperature molecular dynamics techniques for sampling the phase space and path variables. The resulting methodology is applied...
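The object FBSD-type methods approximate is the standard quantum time correlation function (a textbook definition, not the FBSD working expression itself):

```latex
% Quantum time correlation function approximated by FBSD-type methods
C_{AB}(t) = \frac{1}{Z}\,
  \mathrm{Tr}\!\left[ e^{-\beta \hat H}\, \hat A\,
  e^{i \hat H t/\hbar}\, \hat B\, e^{-i \hat H t/\hbar} \right],
\qquad Z = \mathrm{Tr}\, e^{-\beta \hat H}.
```

The path integral representation of e^{-βĤ} handles the thermal (imaginary-time) part, while the semiclassical forward-backward propagation approximates the two real-time evolution operators.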
Network congestion is one of the biggest problems facing HPC systems today, affecting system throughput, performance, user experience, and reproducibility. Congestion manifests as run-to-run variability due to contention for shared resources (e.g., filesystems) or routes between compute endpoints. Despite its significance, current network benchmarks fail to proxy the real-world network utilization seen on congested systems. We propose a new open-source benchmark suite called the Global Performance and Congestion Network Tests (GPCNeT)...
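A minimal sketch of the kind of measurement such a suite automates, a barrier-synchronized ping-pong whose latency can be compared with and without a background congestor, is below (mpi4py assumed; this is not the suite itself).

```python
# Sketch: two-rank ping-pong latency probe. Run alongside a background
# congestor job to observe congestion-induced variability.
# Launch with e.g.: mpirun -n 2 python pingpong.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
buf = np.zeros(8, dtype=np.float64)  # small message
iters = 1000

comm.Barrier()  # synchronize before timing
if rank == 0:
    t0 = MPI.Wtime()
    for _ in range(iters):
        comm.Send(buf, dest=1, tag=0)
        comm.Recv(buf, source=1, tag=0)
    dt = (MPI.Wtime() - t0) / (2 * iters)
    print(f"mean one-way latency: {dt * 1e6:.2f} us")
elif rank == 1:
    for _ in range(iters):
        comm.Recv(buf, source=0, tag=0)
        comm.Send(buf, dest=0, tag=0)
```

Reporting the spread across repeated runs, not just the mean, is what distinguishes a congestion benchmark from a conventional latency test.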
The Weather Research and Forecast (WRF) model is a limited-area model of the atmosphere for mesoscale research and operational numerical weather prediction (NWP). A petascale problem for WRF is the nature run, which provides a very high-resolution "truth" against which more coarse simulations or perturbation runs may be compared for the purposes of studying predictability, stochastic parameterization, and fundamental dynamics. We carried out a nature run involving an idealized high-resolution rotating fluid on a hemisphere to investigate the scales...
As supercomputers are being built from an ever increasing number of processing elements, the effort required to achieve a substantial fraction of system peak performance is continuously growing. Tools are needed that give developers and computing center staff holistic indicators about the resource consumption of applications and potential pitfalls at scale. To use a full supercomputer today, applications must incorporate multilevel parallelism (threading and message passing) and carefully orchestrate file I/O. As a consequence, tools...
Scientists are increasingly considering cloud computing platforms to satisfy their computational needs. Previous work has shown that virtualized environments can have a significant performance impact. However, there is still a limited understanding of the nature of the overheads and the types of applications that might do well in these environments. In this paper we detail benchmarking results that characterize the virtualization overhead and its impact on performance. We also examine various interconnect technologies with...
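Alongside compute kernels, memory bandwidth is a common probe for virtualization overhead; a STREAM-triad-like sketch in numpy is below (the array size is an arbitrary assumption).

```python
# STREAM-triad-like bandwidth probe: a = b + s*c moves ~24 bytes per
# element (read b, read c, write a) for float64 arrays.
import time
import numpy as np

n = 50_000_000                 # arbitrary size, ~400 MB per array
b = np.random.rand(n)
c = np.random.rand(n)
s = 3.0

t0 = time.perf_counter()
a = b + s * c
elapsed = time.perf_counter() - t0
print(f"triad bandwidth: {24 * n / elapsed / 1e9:.1f} GB/s")
```

Comparing this figure on bare metal versus a virtualized instance gives a quick read on memory-subsystem overhead before running full applications.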
I/O efficiency is essential to productivity in scientific computing, especially as many domains become more data-intensive. Many characterization tools have been used to elucidate specific aspects of parallel I/O performance, but analyzing components of complex storage subsystems in isolation fails to provide insight into critical questions: how do the components interact, what are reasonable expectations for application performance, and what are the underlying causes of performance problems? To address these questions while capitalizing on existing...
Hardware specialization is a promising direction for the future of digital computing. Reconfigurable technologies enable specialized hardware with modest non-recurring engineering cost. In this paper, we use FPGAs to evaluate the benefits of building specialized hardware for numerical kernels found in scientific applications. In order to properly evaluate performance, we not only compare Intel Arria 10 and Xilinx U280 performance against Xeon, Xeon Phi, and NVIDIA V100 GPUs, but also extend the Empirical Roofline Toolkit (ERT) to assess our results...
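The Roofline model referenced here bounds attainable performance by the lower of peak compute and bandwidth times arithmetic intensity; a sketch follows (the peak numbers are placeholders, not measured ERT values).

```python
# Roofline model: attainable GFLOP/s is capped by peak compute or by
# memory bandwidth times arithmetic intensity (FLOPs per byte).
def roofline(ai_flops_per_byte, peak_gflops, peak_bw_gbs):
    return min(peak_gflops, ai_flops_per_byte * peak_bw_gbs)

# Placeholder machine: 1000 GFLOP/s peak, 100 GB/s memory bandwidth.
for ai in (0.25, 1.0, 10.0, 100.0):
    print(f"AI={ai:6.2f} -> {roofline(ai, 1000.0, 100.0):7.1f} GFLOP/s")
```

Plotting measured kernels against this bound shows whether an FPGA, CPU, or GPU implementation is compute-bound or bandwidth-bound, which is what makes Roofline a fair cross-architecture yardstick.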
Energy efficiency will be important in future accelerator-based HPC systems, both for sustainability and to improve overall performance. This study proposes a deep neural network (DNN)-based learning model that predicts the execution time and power consumption of workloads across the GPU DVFS design space. Micro-architectural data obtained by running the SPEC-ACCEL, DGEMM, and STREAM benchmarks are used for training. These features are consistent for a workload and unaffected by frequency and input size, reducing the required training data significantly. For real-world...
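A toy version of such a learned model, regressing power on micro-architectural features plus frequency with a small neural network, is sketched below on synthetic data (scikit-learn assumed; not the paper's model or feature set).

```python
# Toy DNN regression: predict power from synthetic "counter" features
# plus core frequency. Illustration only; not the paper's model.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
n = 2000
features = rng.uniform(0, 1, (n, 4))      # stand-in counters
freq_ghz = rng.uniform(0.8, 1.8, (n, 1))  # DVFS setting
X = np.hstack([features, freq_ghz])
# Synthetic ground truth: power grows roughly cubically with frequency.
power_w = 50 + 80 * features[:, 0] + 60 * freq_ghz[:, 0] ** 3
power_w += rng.normal(0, 2, n)

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                     random_state=0).fit(X[:1500], power_w[:1500])
print("held-out R^2:", model.score(X[1500:], power_w[1500:]))
```

Keeping frequency as an explicit input is what lets one trained model sweep the whole DVFS design space instead of retraining per frequency setting.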
NERSC's newest system, Perlmutter, features a 35 PB all-flash Lustre file system built on HPE Cray ClusterStor E1000. We present its architecture, early performance figures, and considerations unique to this architecture. We demonstrate the performance of the E1000 OSSes through low-level tests that achieve over 90% of the theoretical bandwidth of the SSDs at the OST and LNet levels. We also show end-to-end performance for both traditional dimensions of I/O (peak bulk-synchronous bandwidth) and nonoptimal workloads endemic to production computing...
The following topics are dealt with: parallel processing; message passing; machines; application program interfaces; multiprocessing systems; resource allocation; scheduling; cache storage; data analysis; graphics processing units.
Hardware specialization is a promising direction for the future of digital computing. Reconfigurable technologies enable specialized hardware with modest non-recurring engineering cost, but their performance and energy efficiency compared to state-of-the-art processor architectures remain an open question. In this article, we use FPGAs to evaluate the benefits of building specialized hardware for numerical kernels found in scientific applications. In order to properly evaluate performance, we not only compare Intel Arria 10 and Xilinx...
A method is presented for modeling application performance on parallel computers in terms of the performance of microkernels from the HPC Challenge benchmarks. Specifically, the application run time is expressed as a linear combination of inverse speeds and latencies from the microkernels or other system characteristics. The model parameters are obtained by an automated series of least squares fits using backward elimination to ensure statistical significance. If necessary, outliers are deleted so that the final fit is robust. Typically three to four terms appear in each model: at most...
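A sketch of least-squares fitting with backward elimination in the spirit described (repeatedly dropping the least significant term until all remaining terms are significant) is below, using statsmodels on synthetic data; the term names are invented stand-ins.

```python
# Sketch: linear least-squares with backward elimination by p-value.
# Synthetic data; term names are invented stand-ins for microkernel
# inverse speeds and latencies.
import numpy as np
import statsmodels.api as sm

def backward_eliminate(X, y, names, alpha=0.05):
    keep = list(range(X.shape[1]))
    while True:
        fit = sm.OLS(y, X[:, keep]).fit()
        worst = int(np.argmax(fit.pvalues))
        if fit.pvalues[worst] <= alpha or len(keep) == 1:
            return fit, [names[i] for i in keep]
        del keep[worst]  # drop the least significant term

rng = np.random.default_rng(2)
n = 100
X = rng.uniform(0, 1, (n, 4))
y = 3.0 * X[:, 0] + 1.5 * X[:, 2] + rng.normal(0, 0.1, n)
fit, kept = backward_eliminate(
    X, y, ["1/dgemm", "1/stream", "net_lat", "1/fft"])
print("kept terms:", kept)
print("coefficients:", fit.params)
```

On this synthetic example the procedure discards the two irrelevant terms, mirroring how the method typically retains only three or four statistically significant microkernel terms per application.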