- Parallel Computing and Optimization Techniques
- Distributed and Parallel Computing Systems
- Advanced Data Storage Technologies
- Advanced Chemical Physics Studies
- Advanced NMR Techniques and Applications
- Spectroscopy and Quantum Chemical Studies
- Distributed systems and fault tolerance
- Cloud Computing and Resource Management
- Tensor decomposition and applications
- Physics of Superconductivity and Magnetism
- Interconnection Networks and Systems
- Scientific Computing and Data Management
- Machine Learning in Materials Science
- Algorithms and Data Compression
- Quantum and electron transport phenomena
- Embedded Systems Design Techniques
- Matrix Theory and Algorithms
- Atmospheric Ozone and Climate
- Quantum, superfluid, helium dynamics
- Free Radicals and Antioxidants
- Nanopore and Nanochannel Transport Studies
- Radiation Effects in Electronics
- Quantum Computing Algorithms and Architecture
- Protein Structure and Dynamics
- Theoretical and Computational Physics
Nvidia (United States)
2022-2023
Institute of Technology of Cambodia
2022
University of Delaware
2022
University of Basel
2022
Swisscom (Switzerland)
2022
CSCS - Swiss National Supercomputing Centre
2022
Sandia National Laboratories
2022
Helmholtz-Zentrum Dresden-Rossendorf
2022
University of Illinois Urbana-Champaign
2022
Intel (United Kingdom)
2015-2020
Specialized computational chemistry packages have permanently reshaped the landscape of chemical and materials science by providing tools to support and guide experimental efforts and for the prediction of atomistic and electronic properties. In this regard, electronic structure packages have played a special role by using first-principle-driven methodologies to model complex chemical and materials processes. Over the past few decades, the rapid development of computing technologies and the tremendous increase in computational power have offered a unique chance to study complex transformations using sophisticated...
Parallelizing dense matrix computations to distributed memory architectures is a well-studied subject and generally considered to be among the best understood domains of parallel computing. Two packages, developed in the mid 1990s, still enjoy regular use: ScaLAPACK and PLAPACK. With the advent of many-core architectures, which may very well take the shape of distributed memory architectures within a single processor, these packages must be revisited since the traditional MPI-based approaches will likely need to be extended. Thus, this is a good time to review lessons...
BLIS is a new framework for rapid instantiation of the BLAS. We describe how BLIS extends the "GotoBLAS approach" to implementing matrix multiplication (GEMM). While GEMM was previously implemented as three loops around an inner kernel, BLIS exposes two additional loops within that inner kernel, casting the computation in terms of the BLIS micro-kernel so that porting GEMM becomes a matter of customizing this micro-kernel for a given architecture. We discuss how this facilitates a finer level of parallelism that greatly simplifies the multithreading of GEMM as well as additional opportunities for parallelizing multiple...
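The loop structure described in this abstract can be sketched roughly as follows. This is a minimal pure-Python illustration, not BLIS code: the function names `micro_kernel` and `blocked_gemm` and the block sizes `MC`, `NC`, `KC`, `MR`, `NR` are hypothetical choices made for the example, and the packing of A and B into contiguous buffers that a real implementation would perform is omitted.

```python
def micro_kernel(A, B, C, i0, i1, j0, j1, p0, p1):
    """Update the small block C[i0:i1][j0:j1] with a sequence of rank-1 updates."""
    for p in range(p0, p1):
        row_b = B[p]
        for i in range(i0, i1):
            a_ip = A[i][p]
            row_c = C[i]
            for j in range(j0, j1):
                row_c[j] += a_ip * row_b[j]


def blocked_gemm(A, B, C, MC=4, NC=6, KC=3, MR=2, NR=2):
    """Compute C += A * B in place using five loops around a micro-kernel.

    A is m x k, B is k x n, C is m x n, all stored as lists of rows.
    The block sizes are illustrative assumptions, not tuned values.
    """
    m, k, n = len(A), len(B), len(B[0])
    for jc in range(0, n, NC):                  # loop 5: column panels of B and C
        jc_end = min(jc + NC, n)
        for pc in range(0, k, KC):              # loop 4: slices of the inner dimension
            pc_end = min(pc + KC, k)
            for ic in range(0, m, MC):          # loop 3: row blocks of A and C
                ic_end = min(ic + MC, m)
                for jr in range(jc, jc_end, NR):        # loop 2: NR-wide slivers
                    for ir in range(ic, ic_end, MR):    # loop 1: MR-tall slivers
                        micro_kernel(A, B, C,
                                     ir, min(ir + MR, ic_end),
                                     jr, min(jr + NR, jc_end),
                                     pc, pc_end)
    return C
```

In this sketch only `micro_kernel` would need to be rewritten for a given architecture, while the surrounding loops stay generic and are the natural places to distribute work across threads.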
Cyclops (cyclic-operations) Tensor Framework (CTF) is a distributed library for tensor contractions. CTF aims to scale high-dimensional tensor contractions such as those required in the Coupled Cluster (CC) electronic structure method to massively-parallel supercomputers. The framework preserves tensor structure by subdividing tensors cyclically, producing a regular parallel decomposition. An internal virtualization layer provides completely general mapping support while maintaining ideal load balance. The mapping framework decides on the best...
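To illustrate why a cyclic subdivision tends to balance load for structured tensors, the toy sketch below compares a cyclic layout with a contiguous block layout. It is not CTF's mapping logic: the helper names, the 2 x 2 process grid, and the choice of a symmetric matrix stored only for i <= j are assumptions made for the example.

```python
from itertools import product


def cyclic_owner(index, grid):
    """Process coordinates owning `index` under a cyclic layout."""
    return tuple(i % p for i, p in zip(index, grid))


def block_owner(index, n, grid):
    """Process coordinates owning `index` under a contiguous block layout."""
    owner = []
    for i, p in zip(index, grid):
        block = -(-n // p)          # ceiling division: rows or columns per block
        owner.append(i // block)
    return tuple(owner)


def loads(n, grid, owner):
    """Count stored entries (i <= j) of an n x n symmetric matrix per process."""
    counts = {proc: 0 for proc in product(*(range(p) for p in grid))}
    for i in range(n):
        for j in range(i, n):
            counts[owner((i, j))] += 1
    return counts


n, grid = 8, (2, 2)
cyclic = loads(n, grid, lambda idx: cyclic_owner(idx, grid))
block = loads(n, grid, lambda idx: block_owner(idx, n, grid))
print("cyclic:", cyclic)
print("block: ", block)
```

For this small case the block layout leaves one process with no stored entries, whereas the cyclic layout gives every process a comparable share of the triangle.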
MADNESS (multiresolution adaptive numerical environment for scientific simulation) is a high-level software environment for solving integral and differential equations in many dimensions that uses adaptive and fast harmonic analysis methods with guaranteed precision that are based on multiresolution analysis and separated representations. Underpinning the numerical capabilities is a powerful petascale parallel programming environment that aims to increase both programmer productivity and code scalability. This paper describes the features and capabilities of MADNESS and briefly discusses some current...
The coupled cluster (CC) ansatz is generally recognized as providing one of the best wave function-based descriptions of electronic correlation in small- and medium-sized molecules. The fact that the CC equations with double excitations (CCD) may be expressed as a handful of dense matrix-matrix multiplications makes it an ideal method to be ported to graphics processing units (GPUs). We present our implementation of the spin-free CCD equations in which the entire iterative procedure is evaluated on the GPU. The GPU-accelerated algorithm readily...
The thermochemistry of the conversion of glucose to levulinic acid through fructofuranosyl intermediates is investigated using the high-level ab initio methods G4 and G4MP2. The calculated gas phase reaction enthalpies indicate that the first two steps involving water molecule elimination are highly endothermic, while the other steps, including additional water elimination and rehydration to form levulinic acid, are exothermic. The calculated gas phase free energies indicate that inclusion of entropic effects makes the dehydration steps more favorable, although the first two steps are still endothermic. Elevated...
Long-range dispersion interactions have a critical influence on physical quantities in simulations of inhomogeneous systems. However, the perceived computational overhead of long-range solvers has until recently discouraged their implementation in molecular dynamics packages. Here, we demonstrate that reducing the cutoff radius for local interactions in the recently introduced particle-particle particle-mesh (PPPM) method for dispersion [Isele-Holder et al., J. Chem. Phys., 2012, 137, 174107] can actually often be faster than truncating...
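DFT calculations have been performed with the B3LYP and MPW1K functional on hydrogen atom abstraction reactions of ethenoxyl with ethenol and of phenoxyl with both phenol and alpha-naphthol. Comparison with the results of G3 calculations shows that B3LYP seriously underestimates the barrier heights for the reaction of ethenoxyl with ethenol by both proton-coupled electron transfer (PCET) and hydrogen atom transfer (HAT) mechanisms. The MPW1K functional also underestimates the barrier heights, but by much less than B3LYP. Similarly, comparison with the results of experiments on the reaction of phenoxyl radical with alpha-naphthol indicates that the barrier height for the preferred PCET mechanism is calculated more accurately by MPW1K than by B3LYP. These...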
The static dipole polarizabilities of water clusters (2≤N≤12) are determined at the coupled-cluster level of theory (CCSD). For the dipole polarizability of the water monomer it was determined that the role of the basis set is more important than electron correlation and that the basis set augmentation converges with two sets of diffuse functions. The CCSD results are used to benchmark a variety of density functionals, while the performance of several families of basis sets (Dunning, Pople, and Sadlej) in producing accurate values for the polarizabilities of water clusters is also examined. The Sadlej family of basis sets was found to produce accurate results when compared to the ones...
Static hyperpolarizabilities of molecules (water, acetonitrile, chloroform, and para-nitroaniline) are calculated with large basis sets using coupled-cluster response theory and compared to four common density functional methods. These results reveal which methods are appropriate for nonlinear optical studies of different types of molecules and provide a means of estimating errors from the quantum chemical approximation when including vibrational contributions or solvent effects at the QM/MM level. The largest calculation...
The lattice Boltzmann method is increasingly important in facilitating large-scale fluid dynamics simulations. To date, these simulations have been built on discretized velocity models of up to 27 neighbors. Recent work has shown that higher order approximations of the continuum Boltzmann equation enable not only recovery of the Navier-Stokes hydrodynamics, but also simulations for a wider range of Knudsen numbers, which is especially important in micro- and nanoscale flows. These higher-order models have significant impact on both the communication...
This paper summarizes developments in the NWChem computational chemistry suite since the last major release (NWChem 7.0.0). Specifically, we focus on functionality, along with input blocks, that is accessible in the current stable release (NWChem 7.2.0) and in the "master" development branch, interfaces to quantum computing simulators, interfaces to external libraries, the NWChem GitHub repository, and containerization of NWChem executable images. Some ongoing developments that will be available in the near future are also discussed.
We report cutting edge performance results on a single node hybrid CPU-multi-GPU implementation of the spin adapted
The completely renormalized equation-of-motion coupled-cluster approach with singles, doubles, and noniterative triples [CR-EOMCCSD(T)] has proven to be a reliable tool in describing vertical excitation energies in small and medium size molecules. In order to reduce the high numerical cost of the genuine CR-EOMCCSD(T) method and make the CR-EOMCCSD(T) approaches applicable to large molecular systems, two active-space variants of this formalism [the CR-EOMCCSd(t)-II and CR-EOMCCSd(t)-III methods], based on different choices of the subspace...
In this paper we present "Casper," a process-based asynchronous progress solution for MPI one-sided communication on multi- and many-core architectures. Casper uses transparent MPI call redirection through PMPI and MPI-3 shared-memory windows to map memory from multiple user processes into the address space of one or more ghost processes, thus allowing for asynchronous progress where needed while allowing native hardware-based communication where available. Unlike traditional thread- and interrupt-based asynchronous progress models, Casper provides the capability to dedicate an arbitrary...
In this paper, we present the recent advances in the computation of the Dirac–Kohn–Sham (DKS) method in the BERTHA code. We show here that the simple underlined structure of the FORTRAN code also favors efficient porting of the code to GPUs, leading to a particularly efficient hybrid CPU/GPU implementation (OpenMP/OpenACC), where the most computationally intensive part for the DKS matrix evaluation (three-center two-electron integrals evaluated via the McMurchie–Davidson scheme) is efficiently offloaded to the GPU via compiler directives based on OpenACC...
We present an efficient orbital optimization procedure that combines the highly GPU accelerated, spin-adapted density matrix renormalization group (DMRG) method with the complete active space self-consistent field (CAS-SCF) approach for quantum chemistry implemented in the ORCA program package. Leveraging the computational power of the latest generation of Nvidia GPU hardware, we perform CAS-SCF based orbital optimizations for unprecedented CAS sizes of up to 82 electrons in 82 orbitals [CAS(82,82)] in molecular systems comprising active spaces...
Coupled-cluster theory with single and double excitations is applied to the calculation of optical properties of large polyaromatic hydrocarbons. Dipole polarizabilities are reported for benzene, pyrene, and the oligoacenes sequence n=2–6. Dynamic polarizabilities were calculated on polyacenes as large as pentacene for a single frequency and for benzene and pyrene at many frequencies. The basis set effect was studied using a variety of basis sets in the Pople [Theor. Chim. Acta 28, 213 (1973)] and Dunning [J. Chem. Phys. 90, 1007 (1989)] families up to aug-cc-pVQZ and the Sadlej...
The industry-standard Message Passing Interface (MPI) provides one-sided communication functionality and is available on virtually every parallel computing system. However, it is believed that MPI's one-sided model is not rich enough to support higher-level global address space parallel programming models. We present the first successful application of MPI one-sided communication as a runtime system for a PGAS model, Global Arrays (GA). This work has an immediate impact on users of GA applications, such as NWChem, who often must wait several months...