- Advanced Chemical Physics Studies
- Parallel Computing and Optimization Techniques
- Advanced Data Storage Technologies
- Distributed and Parallel Computing Systems
- Cloud Computing and Resource Management
- Catalysis and Oxidation Reactions
- Spectroscopy and Quantum Chemical Studies
- Particle Accelerators and Beam Dynamics
- Particle Accelerators and Free-Electron Lasers
- Quantum Superfluid Helium Dynamics
- Machine Learning in Materials Science
- Interconnection Networks and Systems
- Embedded Systems Design Techniques
- Advanced X-ray Imaging Techniques
- Quantum Computing Algorithms and Architecture
- Physics of Superconductivity and Magnetism
- Semiconductor Materials and Devices
- Advanced Thermodynamics and Statistical Mechanics
- Scientific Computing and Data Management
- Business Process Modeling and Analysis
- Galaxies: Formation, Evolution, Phenomena
- Neutrino Physics Research
- Molecular Spectroscopy and Chirality
- Phase Equilibria and Thermodynamics
- Scheduling and Optimization Algorithms
Lawrence Berkeley National Laboratory
2014-2024
Advanced Technologies Group (United States)
2024
National Energy Research Scientific Computing Center
2012-2024
University of California, Berkeley
2005-2014
University of California System
2011
Jilin University
2010
A summary of the technical advances that are incorporated in the fourth major release of the Q-Chem quantum chemistry program is provided, covering approximately the last seven years. These include developments in density functional theory methods and algorithms, nuclear magnetic resonance (NMR) property evaluation, coupled cluster and perturbation theories, methods for electronically excited and open-shell species, tools for treating extended environments, algorithms for walking on potential surfaces, analysis tools, energy...
Modern scientific datasets present numerous data management and analysis challenges. State-of-the-art index and query technologies are critical for facilitating interactive exploration of large datasets, but challenges remain in designing a system for processing general datasets. The system needs to be able to run on distributed multi-core platforms, efficiently utilize the underlying I/O infrastructure, and scale to massive
Network congestion is one of the biggest problems facing HPC systems today, affecting system throughput, performance, user experience, and reproducibility. Congestion manifests as run-to-run variability due to contention for shared resources (e.g., filesystems) or routes between compute endpoints. Despite its significance, current network benchmarks fail to proxy the real-world utilization seen on congested systems. We propose a new open-source benchmark suite called the Global Performance and Congestion Network Tests...
The perfect pairing (PP) approximation from generalized valence bond theory is formulated in an unrestricted fashion for both closed- and open-shell systems using a coupled cluster ansatz. In the model chemistry proposed here, active electron pairs are correlated, but unpaired or radical electrons remain uncorrelated, leading to a linear number of decoupled amplitudes which can be solved for analytically. The alpha and beta spatial orbitals are variationally optimized independently. This minimal treatment...
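As background to this abstract, a minimal sketch of the standard perfect-pairing cluster operator in its generic textbook form; the notation below is an assumption and is not taken from the paper itself.

```latex
% Sketch of the perfect-pairing ansatz (standard form; notation assumed):
% one paired double excitation, and hence one amplitude, per active pair.
\begin{equation}
  |\Psi_{\mathrm{PP}}\rangle \;=\; e^{\hat T}\,|\Phi\rangle,
  \qquad
  \hat T \;=\; \sum_{i \in \text{active pairs}} t_i\,
      \hat a^{\dagger}_{i^{*}\alpha}\,\hat a^{\dagger}_{i^{*}\beta}\,
      \hat a_{i\beta}\,\hat a_{i\alpha}.
\end{equation}
% Each pair i couples an occupied orbital i to its correlating virtual i*,
% so the amplitude equations decouple into independent quadratic problems
% with closed-form solutions, consistent with the linear count of amplitudes
% mentioned in the abstract.
```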
There are many potential issues associated with deploying the Intel Xeon Phi™ (code named Knights Landing [KNL]) manycore processor in a large-scale supercomputer. One particular issue is the ability to fully utilize the high-speed communications network, given that the serial performance of a KNL core is a fraction of that of a Xeon® core. In this paper, we take a look at the trade-offs of allocating enough cores to drive the Aries network versus dedicating them to computation, e.g., the trade-off between MPI and OpenMP. In addition, we evaluate new features of the Cray...
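For illustration, a hypothetical sketch of the core-allocation arithmetic this trade-off involves: enumerating MPI-rank by OpenMP-thread splits of a KNL-like node while reserving a few cores for communication and the OS. The core counts and the number of reserved cores are assumptions, not values from the paper.

```python
# Hypothetical node decomposition sketch (assumed 68-core KNL-like node,
# 4 cores set aside for network progress / OS; not the paper's configuration).
cores_per_node, reserved = 68, 4
usable = cores_per_node - reserved                 # cores left for computation
splits = [(r, usable // r) for r in range(1, usable + 1) if usable % r == 0]

for ranks, threads in splits:
    print(f"{ranks:2d} MPI ranks x {threads:2d} OpenMP threads "
          f"({ranks * threads} compute cores, {reserved} reserved for comm/OS)")
```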
Understanding patterns of application energy use is key to reaching future HPC efficiency goals. We have measured the sensitivity to CPU frequency for several microbenchmarks and applications on a Cray XC30. We suggest first-order models of performance and power vs. frequency and show that these are sufficient to accurately fit the data. Examination of the resulting models shows that an application's energy/frequency profile has minima only if a) a frequency change crosses an architectural balance point for a performance-critical resource, or b) significant...
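A minimal sketch of the kind of first-order fit the abstract describes, assuming a runtime model T(f) = t_mem + t_cpu/f and a power model P(f) = p_static + p_dyn·f with synthetic data; the paper's actual model forms and measurements may differ.

```python
# Sketch with assumed first-order models and synthetic data (not the paper's
# measurements): fit runtime and power vs. CPU frequency, then locate the
# energy-minimizing frequency from E(f) = T(f) * P(f).
import numpy as np

freqs   = np.array([1.2, 1.6, 2.0, 2.4])            # GHz, example settings
runtime = np.array([48.0, 38.5, 33.0, 29.5])         # seconds (synthetic)
power   = np.array([180.0, 205.0, 235.0, 270.0])     # watts (synthetic)

# T(f) ~ t_mem + t_cpu / f : off-chip-bound work is frequency-insensitive.
A_t = np.column_stack([np.ones_like(freqs), 1.0 / freqs])
t_mem, t_cpu = np.linalg.lstsq(A_t, runtime, rcond=None)[0]

# P(f) ~ p_static + p_dyn * f : leakage plus roughly linear dynamic power.
A_p = np.column_stack([np.ones_like(freqs), freqs])
p_static, p_dyn = np.linalg.lstsq(A_p, power, rcond=None)[0]

f = np.linspace(freqs.min(), freqs.max(), 200)
energy = (t_mem + t_cpu / f) * (p_static + p_dyn * f)
print(f"energy-minimizing frequency ~ {f[np.argmin(energy)]:.2f} GHz")
```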
High performance computing (HPC) is undergoing significant changes. The emerging HPC applications comprise both compute- and data-intensive applications. To meet the intense I/O demand from applications, burst buffers are deployed in production systems. Existing schedulers are mainly CPU-centric. The extreme heterogeneity of hardware devices, combined with workload changes, forces schedulers to consider multiple resources (e.g., burst buffers) beyond CPUs in decision making. In this study, we present a multi-resource...
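A toy sketch of the multi-resource idea, assuming a job must obtain both compute nodes and burst-buffer capacity before it can start; the class, names, and policy here are illustrative stand-ins, not the scheduler proposed in the study.

```python
# Illustrative multi-resource admission check (assumed resources and policy):
# a job starts only if both CPU nodes and burst-buffer capacity are available,
# rather than checking CPUs alone.
from collections import namedtuple

Job = namedtuple("Job", "name nodes bb_gb")

class MultiResourceScheduler:
    def __init__(self, total_nodes, total_bb_gb):
        self.free_nodes = total_nodes
        self.free_bb = total_bb_gb
        self.running = []

    def try_start(self, job):
        if job.nodes <= self.free_nodes and job.bb_gb <= self.free_bb:
            self.free_nodes -= job.nodes
            self.free_bb -= job.bb_gb
            self.running.append(job)
            return True
        return False          # blocked on CPUs *or* on the burst buffer

sched = MultiResourceScheduler(total_nodes=100, total_bb_gb=4000)
for j in [Job("sim", 64, 500), Job("io-heavy", 8, 3800), Job("small", 4, 100)]:
    print(j.name, "started" if sched.try_start(j) else "waiting")
```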
A new algorithm is presented for the sparse representation and evaluation of Slater determinants in the quantum Monte Carlo (QMC) method. The approach, combined with the use of localized orbitals in a Slater-type orbital basis set, significantly extends the size of molecule that can be treated with QMC. Application to systems containing up to 390 electrons confirms that the cost of evaluating the determinant scales linearly with system size. © 2005 Wiley Periodicals, Inc. J Comput Chem 26: 708–715, 2005
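A toy sketch of why localization helps, assuming a banded matrix as a stand-in for a localized-orbital Slater matrix; the bandwidth, size, and use of a sparse LU factorization are illustrative choices, not the paper's algorithm.

```python
# Toy sketch (not the paper's method): with localized orbitals the Slater
# matrix becomes sparse, so its (log-)determinant can be obtained from a
# sparse LU factorization instead of a dense O(N^3) one.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

n, bandwidth = 400, 6                              # assumed size and bandwidth
rng = np.random.default_rng(3)

# Banded random matrix standing in for a localized-orbital Slater matrix.
offsets = list(range(-bandwidth, bandwidth + 1))
diags = [rng.normal(size=n - abs(k)) for k in offsets]
A = sp.diags(diags, offsets=offsets, format="csc")
A = A + 10.0 * sp.identity(n, format="csc")        # keep it well conditioned

lu = splu(A)
log_abs_det = np.log(np.abs(lu.U.diagonal())).sum()   # |det L| = |det P| = 1
print(f"nonzeros: {A.nnz} of {n * n},  log|det| = {log_abs_det:.2f}")
```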
The Zori 1.0 package for electronic structure computations is described. It performs variational and diffusion Monte Carlo calculations as well as correlated wave function optimization. This article presents an overview of the implemented methods and code capabilities. © 2005 Wiley Periodicals, Inc. J Comput Chem 26: 856–862, 2005
The homolytic O−H bond dissociation energy (BDE) of phenol was determined from diffusion Monte Carlo (DMC) calculations using single determinant trial wave functions. DMC gives a BDE of 87.0 ± 0.3 kcal/mol when restricted Hartree−Fock orbitals are used and 87.5 kcal/mol with B3LYP Kohn−Sham orbitals. These results are in good agreement with the extrapolated B3P86 value of Costa Cabral and Canuto (88.3 kcal/mol), the recommended experimental value of Borges dos Santos and Martinho Simões (88.7 ± 0.5 kcal/mol), the G3 value (88.2 kcal/mol), and the CBS-APNO and CBS-QB3 values (87.1...
We present an efficient implementation of the perfect pairing and imperfect pairing coupled-cluster methods, as well as their nuclear gradients, using the resolution of the identity approximation to calculate two-electron integrals. The equations may be solved rapidly, making integral evaluation the bottleneck step. The method's efficiency is demonstrated for a series of linear alkanes, for which we show significant speed-ups (of approximately a factor of 10) with negligible error. We also apply the method to model a recently synthesized stable...
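For context, a sketch of the generic resolution-of-the-identity (density-fitting) factorization of two-electron integrals, (ij|kl) ≈ Σ_P B^P_ij B^P_kl; the tensors below are random stand-ins, and this is not the paper's implementation.

```python
# Generic RI / density-fitting sketch: B^P_ij = sum_Q (ij|Q) [J^{-1/2}]_QP,
# where J_PQ = (P|Q) is the auxiliary Coulomb metric, and then
# (ij|kl) ~ sum_P B^P_ij B^P_kl.  Random tensors stand in for real integrals.
import numpy as np

n, naux = 6, 30                                    # orbital / auxiliary sizes
rng = np.random.default_rng(0)

three_center = rng.normal(size=(n, n, naux))       # (ij|Q), made ij-symmetric
three_center = 0.5 * (three_center + three_center.transpose(1, 0, 2))

M = rng.normal(size=(naux, naux))
J = M @ M.T + naux * np.eye(naux)                  # (P|Q), positive definite

# J^{-1/2} via its eigendecomposition
w, V = np.linalg.eigh(J)
J_inv_half = (V / np.sqrt(w)) @ V.T

B = np.einsum("ijQ,QP->Pij", three_center, J_inv_half)
eri_ri = np.einsum("Pij,Pkl->ijkl", B, B)          # approximate (ij|kl)
print(eri_ri.shape)                                # (6, 6, 6, 6)
```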
The nature of dark energy and the complete theory of gravity are two central questions currently facing cosmology. A vital tool for addressing them is the 3-point correlation function (3PCF), which probes deviations from a spatially random distribution of galaxies. However, the 3PCF's formidable computational expense has prevented its application to astronomical surveys comprising millions to billions of galaxies. We present Galactos, a high-performance implementation of a novel O(N²) algorithm that uses a load-balanced k-d tree...
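An illustrative sketch, not Galactos itself, of the k-d tree neighbour gathering that an O(N²)-style triplet-counting algorithm builds on; the catalog size, box size, and r_max below are assumptions.

```python
# Sketch: gather each galaxy's neighbours within r_max once using a k-d tree;
# a 3PCF estimator then forms triplets from each galaxy's neighbour list
# rather than looping over all N^3 combinations.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)
positions = rng.uniform(0.0, 100.0, size=(10_000, 3))   # synthetic galaxy catalog
r_max = 5.0

tree = cKDTree(positions)
neighbour_lists = tree.query_ball_tree(tree, r_max)     # per-galaxy neighbours

# Triplet candidates around galaxy g are pairs drawn from neighbour_lists[g].
n_pairs = sum(len(nb) - 1 for nb in neighbour_lists)    # excludes self-match
print(f"neighbour pairs within r_max: {n_pairs}")
```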
Power is increasingly becoming a limiting factor in supercomputing. The performance and scale of future high-performance computing systems will be determined by how efficiently they manage their power budgets. Therefore, any amount of unused power is forsaken performance. Regardless of the processors chosen for a system, it is necessary to understand power variation and its implications for optimization. In this paper, we identify and quantify the factors that affect power consumption on the NERSC Cori supercomputer at different levels of the system...
When acquiring a supercomputer it is desirable to specify its performance using a single number. For many procurements, this is usually stated as an increase over the current generation platform, for example, machine A provides 10 times greater performance than machine B. The determination of such a number is not necessarily a simple process; there is no universal agreement on how the calculation is performed, and each facility uses its own method. In the future, the landscape will be further complicated because systems contain heterogeneous...
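One common way such a single figure of merit is computed, shown as an illustrative sketch with made-up runtimes: the geometric mean of per-application speedups of the new machine over the reference machine. Facilities differ on the exact recipe, so this is not any specific site's procedure.

```python
# Illustrative single-number metric (assumed workloads and runtimes):
# geometric mean of per-application speedups of machine A over machine B.
import math

runtimes_B = {"app1": 1200.0, "app2": 860.0, "app3": 2400.0}   # reference system
runtimes_A = {"app1": 150.0, "app2": 95.0, "app3": 210.0}      # candidate system

speedups = [runtimes_B[a] / runtimes_A[a] for a in runtimes_B]
geo_mean = math.exp(sum(math.log(s) for s in speedups) / len(speedups))
print(f"single-number speedup of A over B: {geo_mean:.1f}x")
```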
HPC facilities typically use batch scheduling to space-share jobs. In this paper we revisit time-sharing using a trace of over 2.4 million jobs obtained during 20 months of operation of a modern petascale supercomputer. Our simulations show that space-sharing produces skewed slowdown distributions, with much larger slowdowns for shorter-running jobs, whereas time-sharing produces more uniform slowdowns. Consequently, for applications that strong scale, the turnaround time does not scale under batch scheduling, but it does under time-sharing, resulting in turnarounds that are...
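A short sketch of the (bounded) slowdown metric such comparisons typically rest on, with a made-up two-job trace; the definition is the standard one, not necessarily the exact formulation used in the paper.

```python
# Standard (bounded) slowdown: (wait + run) / run, with the denominator
# bounded by tau so very short jobs do not dominate the metric.
def bounded_slowdown(wait_s, run_s, tau=10.0):
    return max((wait_s + run_s) / max(run_s, tau), 1.0)

trace = [                 # (wait seconds, run seconds), illustrative values
    (3600.0, 120.0),      # short job queued behind a long one
    (3600.0, 86400.0),    # long job with the same wait
]
for wait, run in trace:
    print(f"run={run:8.0f}s  wait={wait:6.0f}s  "
          f"slowdown={bounded_slowdown(wait, run):6.1f}")
```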
Extension of the conventional 2π structure for particle acceleration to 200 MeV or higher requires minimization of power losses by optimizing cavity and drift-tube parameters. Using a previously developed method, an optimization of dimensions was made at energies of 50, 100, and 150 MeV to give minimum power loss with cylindrical drift tubes. Drift tubes of other shapes may reduce the observed high-energy losses due to currents in the surface. Ellipsoidal drift tubes appear more efficient than cylindrical ones. (D.C.W.)
The hierarchical semi-separable (HSS) matrix factorization has useful characteristics for representing low-rank operators on extreme-scale computing systems. To prepare for the higher error rates anticipated with future architectures, this paper introduces new fault-tolerant algorithms for HSS matrix multiplication that maintain efficient performance in the presence of high error rates. The measured runtime overhead of checking and data preservation using the Containment Domains library is exceptionally small and encourages use...
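As a rough illustration of checksum-based checking in the spirit of algorithm-based fault tolerance, a dense toy example that verifies C = AB from column-sum checksums; this is not the paper's HSS algorithm, and the Containment Domains API is not shown.

```python
# Toy checksum verification for a matrix product: ones^T (A @ B) must equal
# (ones^T A) @ B, so a corrupted product can be detected cheaply and either
# recomputed or restored from preserved data.
import numpy as np

def checked_matmul(A, B, tol=1e-8):
    C = A @ B
    expected = (np.ones(A.shape[0]) @ A) @ B     # checksum computed independently
    observed = np.ones(C.shape[0]) @ C           # column sums of the product
    if not np.allclose(expected, observed, atol=tol):
        raise RuntimeError("checksum mismatch: recompute or restore preserved data")
    return C

rng = np.random.default_rng(2)
A, B = rng.normal(size=(64, 64)), rng.normal(size=(64, 64))
C = checked_matmul(A, B)
print("product verified, shape", C.shape)
```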