- Parallel Computing and Optimization Techniques
- Software Testing and Debugging Techniques
- Logic, programming, and type systems
- Advanced Data Storage Technologies
- Stellar, planetary, and galactic studies
- Embedded Systems Design Techniques
- Distributed systems and fault tolerance
- Interconnection Networks and Systems
- Astrophysics and Star Formation Studies
- Cloud Computing and Resource Management
- Software System Performance and Reliability
- Stochastic processes and statistical mechanics
- Low-power high-performance VLSI design
- Numerical Methods and Algorithms
- Scientific Research and Discoveries
- Service-Oriented Architecture and Web Services
- Distributed and Parallel Computing Systems
- Astronomy and Astrophysical Research
- Theoretical and Computational Physics
- Software Engineering Research
- Advanced Malware Detection Techniques
- Galaxies: Formation, Evolution, Phenomena
- Astro and Planetary Science
- Electromagnetic Simulation and Numerical Methods
- Advanced Software Engineering Methodologies
IBM Research - Thomas J. Watson Research Center
1997-2023
IBM (United States)
1997-2020
University of Toronto
2003
Syracuse University
1993-2002
University of Arizona
1995
SRON Netherlands Institute for Space Research
1995
Autonomic computing systems are designed to be self-diagnosing and self-healing, such that they detect performance and correctness problems, identify their causes, and apply the appropriate remedy. These abilities can improve performance, uptime, and security, while simultaneously reducing the effort and skills required of system administrators. One way to support these abilities is by allowing monitoring code, diagnostic code, and function implementations to be dynamically inserted and removed in live systems. This "hot swapping" avoids...
Whenever the need to compile a new dynamically typed language arises, an appealing option is to repurpose an existing statically typed language Just-In-Time (JIT) compiler (repurposed JIT compiler). Existing repurposed JIT compilers (RJIT compilers), however, have not yet delivered the hoped-for performance boosts. The performance of JVM languages, for instance, often lags behind standard interpreter implementations. Even more customized solutions that extend the internals of a JIT compiler for the target language compete poorly with those designed specifically...
Sparse matrix-matrix multiplication (SpGEMM) is a widely used kernel in various graph, scientific computing and machine learning algorithms. It is well known that SpGEMM is a memory-bound operation, and its peak performance is expected to be bound by the memory bandwidth. Yet, existing algorithms fail to saturate the memory bandwidth, resulting in suboptimal performance under the Roofline model. In this paper, we characterize existing SpGEMM algorithms based on their memory access patterns and develop practical lower and upper bounds for SpGEMM performance. We then develop an SpGEMM algorithm based on outer...
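The outer-product formulation mentioned at the end of this abstract can be sketched in a few lines. This is a minimal illustration only, not the authors' algorithm (its details are cut off above). The function name `spgemm_outer` and the dictionary-based storage of A by columns and B by rows are assumptions made for the example.

```python
from collections import defaultdict


def spgemm_outer(a_cols, b_rows):
    """Compute C = A @ B as a sum of outer products (illustrative sketch).

    a_cols[k] lists the (row, value) nonzeros in column k of A.
    b_rows[k] lists the (col, value) nonzeros in row k of B.
    Both layouts are hypothetical choices for this example.
    """
    # Expand phase: column k of A times row k of B yields partial products.
    partial = []
    for k, col in a_cols.items():
        row = b_rows.get(k, [])
        for i, a in col:
            for j, b in row:
                partial.append((i, j, a * b))

    # Merge phase: add up partial products that land on the same (i, j).
    c = defaultdict(float)
    for i, j, v in partial:
        c[(i, j)] += v
    return dict(c)


# A = [[1, 2], [0, 3]] stored by columns, B = [[4, 0], [5, 6]] stored by rows.
a_cols = {0: [(0, 1.0)], 1: [(0, 2.0), (1, 3.0)]}
b_rows = {0: [(0, 4.0)], 1: [(0, 5.0), (1, 6.0)]}
c = spgemm_outer(a_cols, b_rows)
```

The expand phase streams through each input once, while the merge phase is where the irregular memory accesses concentrate. That split is one reason outer-product formulations are of interest for a memory-bound kernel.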
Applications written in dynamically typed scripting languages are increasingly popular for Web software development. Even on the server side, programmers are using dynamically typed scripting languages such as Ruby and Python to build complex applications quickly. As the number and complexity of dynamically typed scripting language applications grows, optimizing their performance is becoming important. Some of the best performing compilers and optimizers for dynamically typed scripting languages are developed entirely from scratch and target a specific language. This approach is not scalable, given the variety of dynamically typed scripting languages, and the effort involved in developing...
The small perpendicular distortions in a large disc galaxy, such as the Milky Way, that are caused by an orbiting intermediate-mass companion such as the Large Magellanic Cloud (LMC) have been modelled with a parallel computer implementation of a three-dimensional N-body particle treecode. The model demonstrates that a mass fraction of 7.5 per cent of the Galaxy and an orbital inclination of 45° can generate height and velocity perturbations in the inner primary galaxy of order several hundred pc and ~10 km s⁻¹, respectively, relative to the unperturbed...
The interaction between the dwarf galaxy in Sagittarius and the Milky Way Galaxy has been modelled with a parallel computer implementation of an N-body treecode. Models are made that reproduce the observed position, size, velocity, proper motion and velocity gradient of the dwarf in its likely pre-disc encounter, and other models are studied in which it has just passed through the disc. Several observable differences between these cases are found. In the pre-collision case, the dwarf is bound to the Galaxy, and it passed through the disc previously 1.7 × 10⁸ yr ago in the anticentre direction. It will cross...
The IBM POWER9 architecture offers a substantial set of novel and performance-improvement features that are made available to both scale-up and scale-out applications via system software. These features provide significant performance improvements for cognitive, cloud, and virtualization workloads, many of which use dynamic scripting languages. In this paper, we describe some of the key features.
Power ISA(TM) Version 3.1 has introduced a new family of matrix math instructions, collectively known as the Matrix-Multiply Assist (MMA) facility. The instructions in this facility implement numerical linear algebra operations on small matrices and are meant to accelerate computation-intensive kernels, such as matrix multiplication, convolution and discrete Fourier transform. These instructions have led to a power- and area-efficient implementation of a high throughput math engine in the future POWER10 processor. Performance per core is 4...
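To make "linear algebra operations on small matrices" concrete, the sketch below builds a small matrix product from accumulating rank-1 (outer-product) updates, a common structure for such kernels. It is plain Python for illustration, not Power ISA code, and the abstract does not specify instruction semantics. The names `rank1_update` and `small_matmul` are hypothetical.

```python
def rank1_update(acc, x, y):
    """Add the outer product of vectors x and y into the accumulator acc, in place."""
    for i, xi in enumerate(x):
        for j, yj in enumerate(y):
            acc[i][j] += xi * yj
    return acc


def small_matmul(a, b):
    """Multiply an m x k matrix by a k x n matrix using k rank-1 updates."""
    m, k, n = len(a), len(b), len(b[0])
    acc = [[0.0] * n for _ in range(m)]  # accumulator tile, zero-initialized
    for p in range(k):
        column_p = [a[i][p] for i in range(m)]  # column p of a
        rank1_update(acc, column_p, b[p])       # times row p of b
    return acc


product = small_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Because the accumulator is updated in place for every step of the inner dimension, the result tile is written out only once at the end. That property is what makes this structure attractive for a hardware accumulator.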
Summary form only given. A hot-swappable component is one that can be replaced with a new or different implementation while the system is running and actively using the component. For example, a component of a TCP/IP protocol stack, when hot-swappable, can be replaced (perhaps to handle new denial-of-service attacks or improve performance), without disturbing existing network connections. The capability to swap components offers a number of potential advantages such as: online upgrades for high availability systems, improved performance due...
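The basic idea of replacing an implementation behind a stable reference can be sketched as follows. This is a minimal illustration, assuming a single level of indirection guarded by a lock. The class name `HotSwappable` and its methods are hypothetical and are not taken from the work summarized above. A real mechanism would also have to deal with calls already in flight and with transferring state between implementations, which this sketch does not attempt.

```python
import threading


class HotSwappable:
    """Route calls through an indirection that can be repointed at run time."""

    def __init__(self, impl):
        self._impl = impl
        self._lock = threading.Lock()

    def call(self, *args, **kwargs):
        # Read the current implementation under the lock, then invoke it
        # outside the lock so a slow call does not block a swap.
        with self._lock:
            impl = self._impl
        return impl(*args, **kwargs)

    def swap(self, new_impl):
        # Install the new implementation and hand back the old one.
        with self._lock:
            old, self._impl = self._impl, new_impl
        return old


component = HotSwappable(lambda x: x + 1)
before = component.call(1)
previous = component.swap(lambda x: x * 10)
after = component.call(1)
```

Callers hold only the `component` reference, so they keep working across the swap without being restarted or reconfigured.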
The GCC (GNU Compiler Collection) project of the Free Software Foundation has resulted in one of the most widespread compilers in use today that is capable of generating code for a variety of platforms. Since 1987, many volunteers from academia and the private sector have been working to continuously improve the functionality and quality of GCC. Some of the compiler's key components were, and continue to be, developed at IBM Research laboratories. We review several of IBM's contributions to the compiler, including a code generator for the zSeries® processor...
New adaptive mesh refinement algorithms provide an opportunity to utilize the same hierarchical tree-structures developed for multipole-based particle simulations in grid-based simulations of both continuum and particle problems. Representing a multipole method simulation with this structure provides a natural formalism with which to unite these two classes of solvers. This paper discusses how multipole methods exploit the basic principle of locality evident in many physical systems, such as those governed by Poisson's Equation, and introduces issues...
In this paper, we present a comprehensive security architecture, Flexible Secure Execution Environment (FlexSEE), for confidential computing in modern cloud environments. FlexSEE does not require the trust of system software on the compute server and guarantees that user data is visible only in non-privileged mode to a designated program trusted by the data owner on designated hardware, thus protecting the data from an untrusted hypervisor, OS, or other users' applications, on the compute server.
C++ has gained broad acceptance as an object-oriented evolutionary extension to the C language, but it severely constrains methods for operating on class objects by forcing all data manipulation through an interface which assumes that basic operations can be implemented as they are written: as unary or binary operators. C++ allows great flexibility in the creation of complex data structures that perform the same functionality as the built-in types of many other languages, but unfortunately it does not allow an equivalent level of feasibility so...