Jacques A. Pienaar

ORCID: 0000-0003-0443-7624
Research Areas
  • Parallel Computing and Optimization Techniques
  • Distributed and Parallel Computing Systems
  • Cloud Computing and Resource Management
  • Logic, programming, and type systems
  • Scientific Computing and Data Management
  • Advanced Data Storage Technologies
  • Tensor decomposition and applications
  • Security and Verification in Computing
  • Data Analysis with R
  • Advanced biosensing and bioanalysis techniques
  • Software Testing and Debugging Techniques
  • Algorithms and Data Compression
  • DNA and Biological Computing
  • Cellular Automata and Applications
  • Distributed systems and fault tolerance
  • Embedded Systems Design Techniques
  • Computational Physics and Python Applications

Google (United States)
2016-2022

Purdue University West Lafayette
2011-2013

North-West University
2008-2009

This work presents MLIR, a novel approach to building reusable and extensible compiler infrastructure. MLIR addresses software fragmentation, compilation for heterogeneous hardware, significantly reducing the cost of building domain specific compilers, and connecting existing compilers together. MLIR facilitates the design and implementation of code generators, translators and optimizers at different levels of abstraction and across application domains, hardware targets and execution environments. The contribution of this work includes (1)...

10.1109/cgo51591.2021.9370308 article EN 2021-02-27
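
To illustrate the central idea of representing code at several levels of abstraction and progressively lowering between them, the following is a toy Python sketch of a dialect-style IR with a rewrite-based lowering pass. It is an illustrative analogue only, not MLIR's actual (C++) API; the op names and structure are invented.

from dataclasses import dataclass, field

# Toy "dialect" ops: a high-level op and the lower-level ops it lowers to.
@dataclass
class Op:
    name: str                          # e.g. "toy.matmul" or "loop.for"
    operands: list = field(default_factory=list)

def lower_matmul(op: Op) -> list[Op]:
    """Rewrite a high-level 'toy.matmul' into schematic loop/arith ops."""
    if op.name != "toy.matmul":
        return [op]                    # leave ops from other dialects untouched
    a, b = op.operands
    return [
        Op("loop.for", ["i"]),
        Op("loop.for", ["j"]),
        Op("loop.for", ["k"]),
        Op("arith.mulf", [f"{a}[i,k]", f"{b}[k,j]"]),
        Op("arith.addf", ["acc", "prod"]),
    ]

def run_pass(module: list[Op], rewrite) -> list[Op]:
    """Apply one lowering rewrite across a whole 'module'."""
    lowered = []
    for op in module:
        lowered.extend(rewrite(op))
    return lowered

# A tiny "module" mixing abstraction levels, lowered in one pass.
module = [Op("toy.matmul", ["A", "B"]), Op("func.return", ["C"])]
print(run_pass(module, lower_matmul))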

This work presents MLIR, a novel approach to building reusable and extensible compiler infrastructure. MLIR aims to address software fragmentation, improve compilation for heterogeneous hardware, significantly reduce the cost of building domain specific compilers, and aid in connecting existing compilers together. MLIR facilitates the design and implementation of code generators, translators and optimizers at different levels of abstraction, and also across application domains, hardware targets and execution environments. The contribution...

10.48550/arxiv.2002.11054 preprint EN cc-by arXiv (Cornell University) 2020-01-01

Graphics Processing Units have emerged as powerful accelerators for massively parallel, numerically intensive workloads. The two dominant software models for these devices are NVIDIA's CUDA and the cross-platform OpenCL standard. Until now, there has not been a fully open-source compiler targeting the CUDA environment, hampering general compiler and architecture research and making deployment difficult in datacenter or supercomputer environments. In this paper, we present gpucc, an LLVM-based, open-source, CUDA-compatible...

10.1145/2854038.2854041 article EN 2016-02-29
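
As a usage note, the technology behind gpucc was subsequently upstreamed into LLVM/Clang, so an open-source CUDA compile today looks roughly like the sketch below; the source file name, GPU architecture, and CUDA install path are assumptions for illustration.

import subprocess

# Hypothetical invocation of the open-source LLVM/Clang CUDA toolchain.
# "saxpy.cu", sm_70 and the CUDA library path are illustrative assumptions.
subprocess.run(
    [
        "clang++", "saxpy.cu",
        "--cuda-gpu-arch=sm_70",                # target GPU architecture
        "-L/usr/local/cuda/lib64", "-lcudart",  # link against the CUDA runtime
        "-O3", "-o", "saxpy",
    ],
    check=True,
)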

One of the major optimizations employed in deep learning frameworks is graph rewriting. Production frameworks rely on heuristics to decide if rewrite rules should be applied and in which order. Prior research has shown that one can discover more optimal tensor computation graphs if we search for a better sequence of substitutions instead of relying on heuristics. However, we observe that existing approaches to superoptimization, both in production and in research frameworks, apply substitutions in a sequential manner. Such methods are sensitive to the order in which substitutions are applied and often only explore a small...

10.48550/arxiv.2101.01332 preprint EN other-oa arXiv (Cornell University) 2021-01-01
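
The order-sensitivity described above can be sketched in a few lines of Python: greedily applying rewrite rules in one fixed order can miss a cheaper graph that a search over rule orderings finds. The rules, op names, and cost weights below are invented for illustration and are not taken from any particular framework.

from itertools import permutations

# A "graph" is just a tuple of op names; each rule rewrites one pattern.
RULES = {
    "fuse_mul_add": (("mul", "add"), ("fma",)),
    "fuse_add_relu": (("add", "relu"), ("bias_relu",)),
}
COST = {"mul": 2, "add": 1, "relu": 2, "fma": 2, "bias_relu": 1}  # arbitrary weights

def apply_rule(graph, rule):
    pattern, replacement = RULES[rule]
    n = len(pattern)
    for i in range(len(graph) - n + 1):
        if graph[i:i + n] == pattern:
            return graph[:i] + replacement + graph[i + n:]
    return graph

def cost(graph):
    return sum(COST[op] for op in graph)

graph = ("mul", "add", "relu")

# Heuristic: apply the rules sequentially in one fixed order.
g = graph
for rule in ["fuse_mul_add", "fuse_add_relu"]:
    g = apply_rule(g, rule)

# Tiny superoptimization-style search: try every ordering, keep the cheapest.
best = min(
    (apply_rule(apply_rule(graph, a), b) for a, b in permutations(RULES)),
    key=cost,
)
print(cost(g), cost(best))   # the fixed order misses the cheaper graph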

We present a runtime framework for the execution of workloads represented as parallel-operator directed acyclic graphs (PO-DAGs) on heterogeneous multi-core platforms. PO-DAGs combine coarse-grained parallelism at the graph level with fine-grained parallelism within each node, lending themselves naturally to exploiting intra- and inter-processing element parallelism. We identify four important criteria - Suitability, Locality, Availability and Criticality (SLAC) - and show that all of these must be considered by the runtime in order to achieve good...

10.1145/1995896.1995933 article EN 2011-05-31
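
A minimal sketch of the kind of decision such a runtime makes: each ready PO-DAG node is assigned to the processing element with the best combined score over the four SLAC criteria. The weights, node properties, and scoring formula below are invented for illustration and are not the paper's actual model.

# Hypothetical processing elements and a PO-DAG node; all numbers are invented.
PES = {
    "cpu": {"speedup": {"matmul": 1.0, "reduce": 1.0}, "busy_until": 0.0},
    "gpu": {"speedup": {"matmul": 8.0, "reduce": 2.0}, "busy_until": 1.0},
}

def slac_score(node, pe_name, data_location, now=0.0):
    """Combine Suitability, Locality, Availability and Criticality into one score."""
    pe = PES[pe_name]
    suitability = pe["speedup"][node["kind"]]            # how well this PE runs the op
    locality = 1.0 if data_location == pe_name else 0.5  # penalty for moving inputs
    availability = 1.0 / (1.0 + max(0.0, pe["busy_until"] - now))
    criticality = node["critical_path_len"]              # weight critical-path nodes higher
    return suitability * locality * availability * criticality

node = {"kind": "matmul", "critical_path_len": 4}
best_pe = max(PES, key=lambda p: slac_score(node, p, data_location="cpu"))
print(best_pe, {p: round(slac_score(node, p, "cpu"), 2) for p in PES})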

JavaScript is the dominant language for implementing dynamic web pages in browsers. Even though it is standardized, many browsers implement the language and browser bindings in different and incompatible ways. As a result, a plethora of development frameworks were developed to hide cross-browser issues and ease the development of large applications. An unwelcome side-effect of these frameworks is that they can introduce memory leaks, despite the fact that JavaScript is garbage collected. Memory bloat is a major issue in such applications, as it affects user-perceived latency and may even prevent...

10.1109/cgo.2013.6495007 article EN 2013-02-01

Pipelining is a well-known approach to increasing parallelism and performance. We address the problem of software pipelining for heterogeneous parallel platforms that consist of different multi-core and many-core processing units. In this context, pipelining involves two key steps -- partitioning an application into stages, and mapping and scheduling the stages onto the processing units of the platform. We show that the inter-dependency between these steps is a critical challenge that must be addressed in order to achieve high performance. We propose the Automatic Heterogeneous Pipelining framework (AHP)...

10.5555/2388996.2389029 article EN IEEE International Conference on High Performance Computing, Data, and Analytics 2012-11-10
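
The coupling between stage partitioning and mapping can be sketched with a toy throughput model: the pipeline's period is set by its slowest stage on the unit it is mapped to, so the best partition depends on the mapping and vice versa. The task names, per-unit costs, and two-stage restriction below are invented for illustration.

from itertools import product

# Per-task cost on each processing unit (invented numbers).
TASKS = ["decode", "filter", "classify", "report"]
COST = {
    "cpu": {"decode": 2, "filter": 4, "classify": 9, "report": 1},
    "gpu": {"decode": 5, "filter": 1, "classify": 2, "report": 4},
}

def stage_partitions(tasks, n_stages):
    """All ways to cut the task chain into n contiguous, non-empty stages."""
    if n_stages == 1:
        yield [tasks]
        return
    for cut in range(1, len(tasks) - n_stages + 2):
        for rest in stage_partitions(tasks[cut:], n_stages - 1):
            yield [tasks[:cut]] + rest

def pipeline_period(stages, mapping):
    """Throughput is limited by the slowest stage on its assigned unit."""
    return max(sum(COST[unit][t] for t in stage) for stage, unit in zip(stages, mapping))

# Jointly search partitions and mappings (one stage per unit) for the best period.
best = min(
    ((stages, mapping)
     for stages in stage_partitions(TASKS, 2)
     for mapping in product(COST, repeat=2)
     if mapping[0] != mapping[1]),
    key=lambda sm: pipeline_period(*sm),
)
print(best, pipeline_period(*best))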

Many computationally-intensive algorithms benefit from the wide parallelism offered by Graphical Processing Units (GPUs). However, the search for a close-to-optimal implementation remains extremely tedious due to the specialization and complexity of GPU architectures.

10.1145/3033019.3033023 preprint EN 2017-02-03

Pipelining is a well-known approach to increasing parallelism and performance. We address the problem of software pipelining for heterogeneous parallel platforms that consist of different multi-core and many-core processing units. In this context, pipelining involves two key steps -- partitioning an application into stages, and mapping and scheduling the stages onto the processing units of the platform. We show that the inter-dependency between these steps is a critical challenge that must be addressed in order to achieve high performance. We propose the Automatic Heterogeneous Pipelining framework (AHP)...

10.1109/sc.2012.22 article EN International Conference for High Performance Computing, Networking, Storage and Analysis 2012-11-01

Probability and statistics with R, by Maria Dolores Ugarte, Ana F. Militino, Alan Arnholt, Boca Raton, Chapman & Hall/CRC Press, 2008, xxvi + 728 pp., £46.99 or US$89.95 (hardback), ISBN 1584888...

10.1080/02664760802416539 article EN Journal of Applied Statistics 2009-05-13

The computational power increases over the past decades have greatly enhanced our ability to simulate chemical reactions and understand ever more complex transformations. Tensor contractions are a fundamental building block of these simulations. These simulations have often been tied to one platform and restricted in generality by the interface provided to the user. The expanding prevalence of accelerators and researcher demands necessitate a general approach which is neither specific to hardware nor requires contortion of algorithms...

10.48550/arxiv.2102.06827 preprint EN cc-by-nc-nd arXiv (Cornell University) 2021-01-01
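
A tensor contraction of the kind these simulations are built on can be written generically, for example with NumPy's einsum; the index pattern below (an amplitude-times-integral style contraction) and the tensor sizes are only illustrations and are independent of any particular hardware backend.

import numpy as np

# Illustrative dimensions only; real quantum-chemistry tensors are far larger.
n_occ, n_virt = 4, 6
t2  = np.random.rand(n_occ, n_occ, n_virt, n_virt)    # amplitudes  T[i,j,a,b]
eri = np.random.rand(n_virt, n_virt, n_virt, n_virt)  # integrals   V[a,b,c,d]

# Contract over the shared virtual indices a,b:  R[i,j,c,d] = sum_ab T[i,j,a,b] V[a,b,c,d]
result = np.einsum("ijab,abcd->ijcd", t2, eri)
print(result.shape)   # (4, 4, 6, 6)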

Synchronisation errors, that is, the insertion of additional or the deletion of valid symbols, are the most difficult class of errors to correct - containing additive errors as a subset. A regenerating algorithm, the core of a channel demodulator, was developed for synchronisation error correcting codes. This class of codes is a generalisation of the code originally proposed by R.R. Varshamov and G.M. Tenengolts. In this article two designs are provided for the regenerator: a general design has been successfully implemented and tested on a general purpose computer (GPC) and a field...

10.1109/sibircon.2008.4602630 article EN IEEE Region International Conference on Computational Technologies in Electrical and Electronics Engineering 2008-07-01
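
For context, the classic Varshamov-Tenengolts construction that this class of codes generalises corrects a single deletion. Below is a minimal Python sketch of that base case (Levenshtein's decoding rule); it does not implement the paper's generalised regenerator or its hardware design.

def vt_syndrome(bits):
    """Checksum sum(i * x_i) that defines VT codes (1-indexed positions)."""
    return sum(i * b for i, b in enumerate(bits, start=1))

def vt_decode_single_deletion(received, n, a=0):
    """Recover the length-n VT_a(n) codeword from which one bit was deleted."""
    w = sum(received)                              # weight of the received word
    d = (a - vt_syndrome(received)) % (n + 1)      # syndrome deficiency
    bits = list(received)
    if d <= w:
        # A 0 was deleted: reinsert it with exactly d ones to its right.
        ones_seen, pos = 0, len(bits)
        while ones_seen < d:
            pos -= 1
            ones_seen += bits[pos]
        bits.insert(pos, 0)
    else:
        # A 1 was deleted: reinsert it with exactly d - w - 1 zeros to its left.
        zeros_needed = d - w - 1
        pos, zeros_seen = 0, 0
        while zeros_seen < zeros_needed:
            zeros_seen += 1 - bits[pos]
            pos += 1
        bits.insert(pos, 1)
    return bits

# Codeword 1 0 0 1 satisfies sum(i*x_i) = 5 ≡ 0 (mod 5); delete its second bit.
print(vt_decode_single_deletion([1, 0, 1], n=4, a=0))   # -> [1, 0, 0, 1]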

10.1080/02664760802193336 article Journal of Applied Statistics 2008-10-10