- Parallel Computing and Optimization Techniques
- Interconnection Networks and Systems
- Advanced Data Storage Technologies
- Matrix Theory and Algorithms
- Distributed and Parallel Computing Systems
- Advanced Neural Network Applications
- Molecular Junctions and Nanostructures
- Advanced Numerical Methods in Computational Mathematics
- Embedded Systems Design Techniques
- Machine Learning and ELM
- Advanced NMR Techniques and Applications
- Force Microscopy Techniques and Applications
- Stochastic Gradient Optimization Techniques
- Electron and X-Ray Spectroscopy Techniques
- Neural Networks and Applications
- Protein Structure and Dynamics
- Ferroelectric and Negative Capacitance Devices
- Advanced Optimization Algorithms Research
- Elasticity and Material Modeling
- Machine Learning in Materials Science
- Distributed Systems and Fault Tolerance
- Graph Theory and Applications
- Advanced Memory and Neural Computing
- Scientific Computing and Data Management
- Model Reduction and Neural Networks
Cerebras Systems (United States)
2020-2023
Intel (United States)
2013-2019
Apple (United States)
2007
Apple (Israel)
2007
Oracle (United States)
2002-2006
University of California, Los Angeles
1997
The performance of CPU-based and GPU-based systems is often low for PDE codes, where large, sparse, structured systems of linear equations must be solved. Iterative solvers are limited by data movement, both between caches and memory and between nodes. Here we describe the solution of such systems on the Cerebras Systems CS-1, a wafer-scale processor that has the memory bandwidth and communication latency to perform well. We achieve 0.86 PFLOPS on a single system for BiCGStab applied to a linear system arising from a 7-point finite difference stencil on a 600 × 595 × 1536 mesh, achieving...
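The solver named in this abstract is a short algorithm; the following is a minimal, unpreconditioned BiCGStab applied to a matrix-free 7-point stencil. The grid size, boundary conditions, and tolerance here are illustrative only, not the paper's setup.

```python
import numpy as np

def laplacian_7pt(u_flat, n):
    """Matrix-free 7-point stencil on an n^3 grid, zero Dirichlet boundaries."""
    u = np.zeros((n + 2, n + 2, n + 2))
    u[1:-1, 1:-1, 1:-1] = u_flat.reshape(n, n, n)
    c = u[1:-1, 1:-1, 1:-1]
    out = (6.0 * c
           - u[:-2, 1:-1, 1:-1] - u[2:, 1:-1, 1:-1]
           - u[1:-1, :-2, 1:-1] - u[1:-1, 2:, 1:-1]
           - u[1:-1, 1:-1, :-2] - u[1:-1, 1:-1, 2:])
    return out.ravel()

def bicgstab(A, b, tol=1e-10, maxiter=500):
    """Unpreconditioned BiCGStab (van der Vorst); A is a callable matvec."""
    x = np.zeros_like(b)
    r = b - A(x)
    r0 = r.copy()                       # shadow residual
    rho = alpha = omega = 1.0
    v = np.zeros_like(b)
    p = np.zeros_like(b)
    for _ in range(maxiter):
        rho_new = r0 @ r
        beta = (rho_new / rho) * (alpha / omega)
        rho = rho_new
        p = r + beta * (p - omega * v)
        v = A(p)
        alpha = rho / (r0 @ v)
        s = r - alpha * v
        t = A(s)
        omega = (t @ s) / (t @ t)
        x = x + alpha * p + omega * s
        r = s - omega * t
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
    return x

n = 6                                   # toy grid; the paper's mesh is far larger
b = np.ones(n ** 3)
x = bicgstab(lambda u: laplacian_7pt(u, n), b)
```

Matrix-free operators like `laplacian_7pt` are the natural fit for stencil hardware: the solver only ever needs a matvec, never an assembled matrix.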
The evolution of molecular dynamics (MD) simulations has been intimately linked to that of computing hardware. For decades following the creation of MD, simulations have improved with computing power along three principal dimensions: accuracy, atom count (spatial scale), and duration (temporal scale). Since the mid-2000s, computer platforms have, however, failed to provide strong scaling for MD, as scale-out central processing unit (CPU) and graphics processing unit (GPU) platforms deliver substantial increases in spatial scale that do not lead to proportional increases in temporal scale. Important scientific problems therefore...
The NAS parallel benchmarks (NPB) are a set of applications commonly used to evaluate parallel systems. We use the NPB-OpenMP version to examine the performance of Intel's new Xeon Phi co-processor, focusing in particular on the many-core aspect of the architecture. A first analysis studies scalability up to 244 threads on 61 cores and the impact of affinity settings on scaling; it also compares these characteristics with those of traditional CPUs. The application of several well-established optimization techniques allows us to identify common bottlenecks that can...
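As a back-of-the-envelope companion to such scaling studies, Amdahl's law bounds the speedup achievable on many threads; the serial fraction below is purely illustrative.

```python
def amdahl_speedup(serial_fraction, threads):
    """Amdahl's law: ideal speedup on `threads` threads for a code whose
    non-parallelizable fraction of work is `serial_fraction`."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / threads)

# Even a 1% serial fraction caps speedup far below the 244-thread ideal.
s = amdahl_speedup(0.01, 244)   # roughly 71x, not 244x
```

This is why scalability studies on many-core parts focus so heavily on eliminating serial sections and load imbalance.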
This paper provides a systematic comparison of various characteristics of computationally intensive workloads. Our analysis focuses on standard HPC benchmarks and representative applications. For the selected workloads we provide a wide range of characterizations based on instruction tracing and hardware counter measurements.
Online Normalization is a new technique for normalizing the hidden activations of a neural network. Like Batch Normalization, it normalizes over the sample dimension. While it does not use batches, it is as accurate as Batch Normalization. We resolve a theoretical limitation of Batch Normalization by introducing an unbiased technique for computing the gradient of normalized activations. Online Normalization works with automatic differentiation by adding a statistical normalization primitive. It can be used in cases not covered by some other normalizers, such as recurrent networks, fully connected...
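A forward-pass-only sketch of the idea: normalize each sample with exponentially decayed running statistics instead of batch statistics. The decay rate and epsilon below are illustrative, and the paper's key contribution, the unbiased gradient for the backward pass, is not shown.

```python
import numpy as np

class OnlineNorm:
    """Sketch of batch-free normalization (forward pass only): each sample is
    normalized by exponentially decayed running estimates of mean and
    variance, updated one sample at a time. `alpha` and `eps` are
    illustrative choices, not the paper's settings."""

    def __init__(self, num_features, alpha=0.99, eps=1e-5):
        self.mu = np.zeros(num_features)
        self.var = np.ones(num_features)
        self.alpha, self.eps = alpha, eps

    def __call__(self, x):
        y = (x - self.mu) / np.sqrt(self.var + self.eps)
        # Update running statistics after using them on the current sample.
        d = x - self.mu
        self.var = self.alpha * self.var + self.alpha * (1 - self.alpha) * d * d
        self.mu = self.mu + (1 - self.alpha) * d
        return y

rng = np.random.default_rng(0)
norm = OnlineNorm(4)
# Stream 5000 samples from N(3, 2); outputs should approach zero mean, unit std.
outs = np.array([norm(rng.normal(3.0, 2.0, size=4)) for _ in range(5000)])
```

After a warm-up period the running estimates track the input distribution, so no batch is ever materialized.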
We have implemented fast Fourier transforms for one-, two-, and three-dimensional arrays on the Cerebras CS-2, a system whose memory and processing elements reside on a single silicon wafer. The wafer-scale engine (WSE) encompasses a two-dimensional mesh of roughly 850,000 processing elements (PEs) with fast local memory and equally fast nearest-neighbor interconnections. Our wafer-scale FFT (wsFFT) parallelizes an $n^3$ problem with up to $n^2$ PEs. At this degree of parallelism, each PE processes only a single vector of the 3D domain (known as a pencil) per superstep, where each of the three supersteps...
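The pencil decomposition reduces a 3D transform to three rounds of 1D FFTs, one axis per round. The serial sketch below mirrors that structure; on the wafer, each pencil would belong to one PE, with data transposed across the mesh between supersteps.

```python
import numpy as np

def fft3d_pencils(a):
    """3D FFT as three supersteps of 1D FFTs along pencils, one axis per
    superstep; mathematically equivalent to np.fft.fftn on a cube."""
    for axis in range(3):
        a = np.fft.fft(a, axis=axis)
    return a

n = 8
x = np.random.default_rng(1).random((n, n, n))
y = fft3d_pencils(x)
```

The separability of the DFT is what makes this decomposition exact: transforming each axis in turn yields the full 3D transform.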
In this work, we apply the ideas of domain decomposition and multi‐grid methods to PDE‐based eigenvalue problems represented in two equivalent variational formulations. To find the lowest eigenpair, we use a “subspace correction” framework for deriving a multiplicative algorithm that minimizes the Rayleigh quotient at the current iteration. By considering an equivalent minimization formulation proposed by Mathew and Reddy, we can use the theory of Schwarz algorithms for non‐linear optimization, developed by Tai and Espedal, to analyse the convergence...
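To make the objective concrete: the lowest eigenpair minimizes the Rayleigh quotient $R(x) = x^\top A x / x^\top x$. Below is a toy projected-gradient minimizer of $R$, a stand-in for intuition only; the paper's subspace-correction and Schwarz machinery is far more sophisticated, and the fixed step size here is an arbitrary illustrative choice.

```python
import numpy as np

def lowest_eigenpair(A, iters=3000):
    """Toy Rayleigh-quotient minimization: gradient steps on
    R(x) = x'Ax / x'x, renormalizing x after each step."""
    x = np.random.default_rng(0).standard_normal(A.shape[0])
    x /= np.linalg.norm(x)
    for _ in range(iters):
        lam = x @ A @ x                      # current Rayleigh quotient
        x = x - 0.5 * (A @ x - lam * x)      # step along -gradient direction
        x /= np.linalg.norm(x)               # project back to the unit sphere
    return lam, x

# 1-D discrete Laplacian: the lowest eigenpair is the smoothest mode.
n = 10
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
lam, vec = lowest_eigenpair(A)
```

Each step moves opposite the Rayleigh-quotient gradient $2(Ax - R(x)\,x)/x^\top x$, so stationary points are exactly the eigenvectors of $A$.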
This work presents a general methodology for estimating the performance of an HPC workload when running on a future hardware architecture. Further, it demonstrates the methodology on a significant scientific application -- the Gyrokinetic Toroidal Code (GTC) -- executing on Sun's proposed next-generation petascale computer architecture. For GTC, we identify the important phases of an iteration and perform low-level analysis that includes instruction tracing and component simulations of the processor and memory systems. The low-level analysis is complemented...
Molecular dynamics (MD) simulations have transformed our understanding of the nanoscale, driving breakthroughs in materials science, computational chemistry, and several other fields, including biophysics and drug design. Even on exascale supercomputers, however, runtimes are excessive for the systems and timescales of scientific interest. Here, we demonstrate strong scaling of MD on the Cerebras Wafer-Scale Engine. By dedicating a processor core to each simulated atom, we achieve a 179-fold improvement in timesteps per second versus...
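The per-atom work underlying such simulations is a short integration loop. Below is a serial velocity-Verlet sketch on a toy harmonic oscillator; the force law, step size, and units are illustrative, and no claim is made that this matches the paper's force fields.

```python
import numpy as np

def velocity_verlet(pos, vel, force, mass, dt, steps):
    """Velocity-Verlet integration -- the kind of per-atom update that, in a
    core-per-atom mapping, each processor core would repeat every timestep."""
    f = force(pos)
    for _ in range(steps):
        vel = vel + 0.5 * dt * f / mass   # half-kick
        pos = pos + dt * vel              # drift
        f = force(pos)                    # recompute forces at new positions
        vel = vel + 0.5 * dt * f / mass   # second half-kick
    return pos, vel

# Toy system: one particle in a harmonic well (k = m = 1).
pos, vel = velocity_verlet(np.array([1.0]), np.array([0.0]),
                           lambda x: -x, 1.0, 0.01, 1000)
energy = 0.5 * vel[0] ** 2 + 0.5 * pos[0] ** 2   # should stay near 0.5
```

Velocity Verlet is symplectic, so total energy stays bounded near its initial value over long runs, which is why it is the workhorse integrator for MD.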
We present a high-level and accessible Application Programming Interface (API) for the solution of field equations on the Cerebras Systems Wafer-Scale Engine (WSE), with over two orders of magnitude performance gain relative to traditional distributed computing approaches. The domain-specific API is called the WSE Field-equation API (WFA). WFA outperforms OpenFOAM on NETL's Joule 2.0 supercomputer in time to solution. While this is consistent with hand-optimized assembly codes, WFA provides an easy-to-use, Python...
Solving 3-D partial differential equations in a finite element model is computationally intensive and requires extremely high memory and communication bandwidth. This paper describes a novel approach in which mesh points of varying resolution are mapped onto a large 2-D homogeneous array of processors. Cerebras has developed a supercomputer powered by a 21.5 cm Wafer-Scale Engine (WSE) with 850,000 programmable compute cores. With 2.6 trillion transistors in a 7 nm process, this is by far the largest chip in the world. It is structured as...