- Parallel Computing and Optimization Techniques
- Distributed and Parallel Computing Systems
- Logic, programming, and type systems
- Advanced Data Storage Technologies
- Scientific Computing and Data Management
- Model-Driven Software Engineering Techniques
- Cloud Computing and Resource Management
- Software Engineering Research
- Ferroelectric and Negative Capacitance Devices
- Formal Methods in Verification
- Tensor decomposition and applications
- Software System Performance and Reliability
- Computational Physics and Python Applications
- Advanced Numerical Methods in Computational Mathematics
- Advanced Database Systems and Queries
- Constraint Satisfaction and Optimization
- Embedded Systems Design Techniques
- Handwritten Text Recognition Techniques
- Natural Language Processing Techniques
- Optimization and Search Problems
- Data Visualization and Analytics
- Distributed systems and fault tolerance
- Evolutionary Algorithms and Applications
- Algorithms and Data Compression
- Computational Geometry and Mesh Generation
Google (United States)
2020
Institut national de recherche en informatique et en automatique
2016-2019
École Normale Supérieure - PSL
2019
Inria Saclay - Île de France
2016-2018
Laboratoire de l'Informatique du Parallélisme
2017
Laboratoire de Recherche en Informatique
2014-2016
Université Paris-Sud
2016
Université Paris-Saclay
2016
National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”
2012
This work presents MLIR, a novel approach to building reusable and extensible compiler infrastructure. MLIR addresses software fragmentation and compilation for heterogeneous hardware, significantly reducing the cost of domain-specific compilers and connecting existing compilers together. MLIR facilitates the design and implementation of code generators, translators, and optimizers at different levels of abstraction and across application domains, hardware targets, and execution environments. The contribution of this work includes (1)...
Deep learning models with convolutional and recurrent networks are now ubiquitous and analyze massive amounts of audio, image, video, text, and graph data, with applications in automatic translation, speech-to-text, scene understanding, ranking user preferences, ad placement, etc. Competing frameworks for building these networks, such as TensorFlow, Chainer, CNTK, Torch/PyTorch, Caffe1/2, MXNet, and Theano, explore different tradeoffs between usability and expressiveness, research or production orientation, and supported...
This work presents MLIR, a novel approach to building reusable and extensible compiler infrastructure. MLIR aims to address software fragmentation, improve compilation for heterogeneous hardware, significantly reduce the cost of domain-specific compilers, and aid in connecting existing compilers together. MLIR facilitates the design and implementation of code generators, translators, and optimizers at different levels of abstraction and also across application domains, hardware targets, and execution environments. The contribution...
Most compilers have a single core intermediate representation (IR) (e.g., LLVM), sometimes complemented with vaguely defined IR-like data structures. This IR is commonly low-level and close to machine instructions. As a result, optimizations relying on domain-specific information are either not possible or require complex analysis to recover the missing information. In contrast, multi-level rewriting instantiates a hierarchy of dialects (IRs), lowers programs level-by-level, and performs code...
We present Polygeist, a new compilation flow that connects the MLIR compiler infrastructure to cutting-edge polyhedral optimization tools. It consists of a C and C++ frontend capable of converting a broad range of existing codes into MLIR suitable for polyhedral transformation, and a bi-directional conversion between MLIR and the OpenScop exchange format. The Polygeist/MLIR intermediate representation, featuring high-level (affine) loop constructs and n-D arrays embedded into a single static assignment (SSA) substrate, enables an unprecedented...
Deep learning frameworks automate the deployment, distribution, synchronization, memory allocation, and hardware acceleration of models represented as graphs of computational operators. These operators wrap high-performance libraries such as cuDNN or NNPACK. When the computation does not match any predefined library call, custom operators must be implemented, often at high engineering cost and performance penalty, limiting the pace of innovation. To address this productivity gap, we propose and evaluate: (1) a domain-specific...
While parallelism remains the main source of performance, architectural implementations and programming models change with each new hardware generation, often leading to costly application re-engineering. Most tools for performance portability require manual porting to yet another programming model.
While compilers offer a fair trade-off between productivity and executable performance in single-threaded execution, their optimizations remain fragile when addressing compute-intensive code for parallel architectures with deep memory hierarchies. Moreover, these optimizations operate as black boxes, impenetrable to the user, leaving them no alternative to time-consuming and error-prone manual optimization in cases where an imprecise cost model or a weak analysis resulted in a bad decision. To address this issue, we...
The construction of effective loop nest optimizers and parallelizers remains challenging despite decades of work in the area. Due to the increasing diversity of loop-intensive applications and the complex memory/computation hierarchies of modern processors, optimization heuristics are pulled towards conflicting goals, highlighting the lack of a systematic approach to optimizing locality and parallelism. Acknowledging these demands on optimization, we propose an algorithmic template capable of modeling multi-level parallelism...
Multi-level intermediate representations (IR) show great promise for lowering the design costs of domain-specific compilers by providing a reusable, extensible, and non-opinionated framework for expressing high-level abstractions directly in the IR. But, while such frameworks support the progressive lowering of high-level abstractions to low-level IR, they do not raise in the opposite direction. Thus, the entry point into the compilation pipeline defines the highest level of abstraction for all subsequent transformations, limiting the set of applicable optimizations,...
Increasingly complex hardware makes the design of effective compilers difficult. To reduce this problem, we introduce Declarative Loop Tactics, a novel framework of composable program transformations based on an internal tree-like program representation of a polyhedral compiler. The declarative C++ API, built around easy-to-program matchers and builders, provides a foundation to develop loop optimization strategies. Using our matchers and builders, we express computational patterns and core building blocks, such as tiling, fusion,...
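As an illustration of tiling, one of the building blocks named in the abstract above, the following minimal Python sketch compares a plain matrix multiplication with a tiled version of the same loop nest. It does not use the Declarative Loop Tactics API; the function names and the `tile` parameter are hypothetical, chosen only for this example.

```python
def matmul(a, b):
    """Reference triple loop: c[i][j] += a[i][k] * b[k][j]."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, tile=2):
    """Same computation with each loop split into a tile loop and a point loop.

    The outer (ii, jj, kk) loops walk over tiles; the inner (i, j, k) loops
    visit the points inside one tile. `min` handles partial tiles at the edges.
    """
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for ii in range(0, n, tile):
        for jj in range(0, p, tile):
            for kk in range(0, m, tile):
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, p)):
                        for k in range(kk, min(kk + tile, m)):
                            c[i][j] += a[i][k] * b[k][j]
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[1, 0], [0, 1], [2, 3]]
print(matmul(a, b) == matmul_tiled(a, b))
```

Tiling only reorders the iterations; with integer inputs both versions produce identical results, which is the property a transformation framework has to preserve.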
Numerical simulation often resorts to iterative in-place stencils such as the Gauss-Seidel or Successive Overrelaxation (SOR) methods. Writing high-performance implementations of such stencils requires significant effort and time; it also involves non-local transformations beyond the stencil kernel itself. While automated code generation is a mature technology for image processing stencils, convolutions, and out-of-place stencils (such as the Jacobi method), the optimization of in-place stencils requires manual craftsmanship. Building on recent advances in tensor...
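The distinction between in-place and out-of-place stencils can be seen in a minimal 1D sketch. The averaging kernel, function names, and input below are assumptions made for illustration and are not taken from the paper.

```python
def jacobi_sweep(u):
    """Out-of-place sweep: every update reads only the previous iterate."""
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = 0.5 * (u[i - 1] + u[i + 1])
    return new


def gauss_seidel_sweep(u):
    """In-place sweep: u[i - 1] is already updated when u[i] is computed."""
    u = list(u)  # copy only so the caller's list is left untouched
    for i in range(1, len(u) - 1):
        u[i] = 0.5 * (u[i - 1] + u[i + 1])
    return u


u0 = [1.0, 0.0, 0.0, 0.0, 0.0]
print(jacobi_sweep(u0))        # [1.0, 0.5, 0.0, 0.0, 0.0]
print(gauss_seidel_sweep(u0))  # [1.0, 0.5, 0.25, 0.125, 0.0]
```

In the Jacobi sweep all iterations of the loop are independent, whereas the Gauss-Seidel sweep carries a dependence from iteration `i - 1` to iteration `i`, which is why the boundary value propagates across the whole array in a single sweep and why the loop cannot be parallelized naively.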
Parallel systems are now omnipresent and their effective use requires significant effort and expertise from software developers. A multitude of languages and libraries offer convenient ways to express parallelism, but fall short at helping programmers find parallelism in existing programs. To address this issue, we introduce Clint, a direct manipulation tool aimed at easing both the extraction and the expression of parallelism. Clint builds on the polyhedral representation of programs to convey dynamic behavior, perform...
Parallelism is one of the key performance sources in modern computer systems. When heuristics-based automatic parallelization fails to improve performance, a cumbersome and error-prone manual transformation is often required. As a solution, we propose an interactive visual approach building on the polyhedral model that visualizes exact dependencies and parallelism; decomposes and replays a complex automatically computed transformation step by step; and allows for directly manipulating the visual representation as a means of transforming the program...