Christoph Keßler

ORCID: 0000-0001-5241-0026
Research Areas
  • Parallel Computing and Optimization Techniques
  • Distributed and Parallel Computing Systems
  • Interconnection Networks and Systems
  • Cloud Computing and Resource Management
  • Embedded Systems Design Techniques
  • Advanced Data Storage Technologies
  • Algorithms and Data Compression
  • Software System Performance and Reliability
  • Advanced Software Engineering Methodologies
  • Real-Time Systems Scheduling
  • Distributed systems and fault tolerance
  • Air Quality and Health Impacts
  • Scientific Computing and Data Management
  • Formal Methods in Verification
  • Climate Change and Health Impacts
  • Logic, programming, and type systems
  • Graph Theory and Algorithms
  • Simulation Techniques and Applications
  • Quantum Computing Algorithms and Architecture
  • Computer Graphics and Visualization Techniques
  • Complex Network Analysis Techniques
  • Opinion Dynamics and Social Influence
  • Game Theory and Applications
  • Cellular Automata and Applications
  • Low-power high-performance VLSI design

Linköping University
2015-2024

Hasso Plattner Institute
2016-2017

Rhenish Institute for Environmental Research
2010-2014

University of Cologne
2010-2014

University of Kaiserslautern
2012

Linnaeus University
2011

Karlsruhe Institute of Technology
2011

Kreditanstalt für Wiederaufbau
2009

Universität Trier
1995-2003

Saarland University
1994-2002

We present SkePU, a C++ template library which provides a simple and unified interface for specifying data-parallel computations with the help of skeletons on GPUs using CUDA and OpenCL. The interface is also general enough to support other architectures, and SkePU implements both a sequential CPU and a parallel OpenMP backend. It also supports multi-GPU systems.

10.1145/1863482.1863487 article EN 2010-09-25

Recent studies have shown an association of short-term exposure to fine particulate matter (PM) with transient increases in blood pressure (BP), but it is unclear whether long-term exposure has an effect on arterial BP and hypertension.

10.1289/ehp.1103564 article EN public-domain Environmental Health Perspectives 2011-08-09

Many modern parallel computing systems are heterogeneous at their node level. Such nodes may comprise general purpose CPUs and accelerators (such as a GPU or Intel Xeon Phi) that provide high performance with suitable energy-consumption characteristics. However, exploiting the available performance of heterogeneous architectures may be challenging. There are various parallel programming frameworks (such as OpenMP, OpenCL, OpenACC, CUDA), and selecting the one that is suitable for a target context is not straightforward. In this paper, we study empirically...

10.1145/3110355.3110356 article EN 2017-07-21

PEPPHER, a three-year European FP7 project, addresses efficient utilization of hybrid (heterogeneous) computer systems consisting of multicore CPUs with GPU-type accelerators. This article outlines the PEPPHER performance-aware component model, performance prediction means, runtime system, and other aspects of the project. A larger example demonstrates the portability of the PEPPHER approach across systems with one to four GPUs.

10.1109/mm.2011.67 article EN IEEE Micro 2011-07-26

In this article we present SkePU 2, the next generation of the SkePU C++ skeleton programming framework for heterogeneous parallel systems. We critically examine the design and limitations of the SkePU 1 programming interface. We present a new, flexible and type-safe interface for skeleton programming in SkePU 2, and a source-to-source transformation tool which knows about SkePU 2 constructs such as skeletons and user functions. We demonstrate how the source-to-source compiler transforms programs to enable efficient execution on parallel heterogeneous systems. We show how SkePU 2 enables new use-cases and applications by increasing the flexibility from SkePU 1, and how programming errors can be...

10.1007/s10766-017-0490-5 article EN cc-by International Journal of Parallel Programming 2017-01-28

SkePU is a C++ template library that provides a simple and unified interface for specifying data-parallel computations with the help of skeletons on GPUs using CUDA and OpenCL. The interface is also general enough to support other architectures, and SkePU implements both a sequential CPU and a parallel OpenMP backend. It also supports multi-GPU systems. Currently available skeletons in SkePU include map, reduce, mapreduce, map-with-overlap, maparray, and scan. The performance of SkePU generated code is comparable to that of hand-written code, even for more complex applications such...

10.1145/1984693.1984697 article EN 2011-05-21
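
The map and reduce skeletons named in the abstract above can be illustrated with a minimal sketch in plain C++. This is not the SkePU API: the names `map_skeleton`, `reduce_skeleton`, and `dot_product` are hypothetical, and only a sequential CPU backend is shown. The point is that the caller supplies a user function while the skeleton owns the iteration, so a parallel backend could replace the loop without changing the calling code.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical names for illustration only; this is not the SkePU interface.

// Map skeleton: applies a binary user function element-wise to two
// equally sized inputs. A parallel backend could replace this loop.
template <typename T, typename F>
std::vector<T> map_skeleton(const std::vector<T>& a, const std::vector<T>& b, F user_fn) {
    assert(a.size() == b.size());
    std::vector<T> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) {
        out[i] = user_fn(a[i], b[i]);
    }
    return out;
}

// Reduce skeleton: folds all elements with a binary user function,
// starting from the given identity value.
template <typename T, typename F>
T reduce_skeleton(const std::vector<T>& v, T identity, F user_fn) {
    return std::accumulate(v.begin(), v.end(), identity, user_fn);
}

// MapReduce composition: a dot product expressed as map (multiply)
// followed by reduce (add).
template <typename T>
T dot_product(const std::vector<T>& a, const std::vector<T>& b) {
    std::vector<T> products = map_skeleton(a, b, [](T x, T y) { return x * y; });
    return reduce_skeleton(products, T{0}, [](T x, T y) { return x + y; });
}
```

Calling `dot_product` on the vectors `{1, 2, 3}` and `{4, 5, 6}` multiplies element-wise and then sums the products.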

We discuss three complementary approaches that can provide both portability and an increased level of abstraction for the programming of heterogeneous multicore systems. Together, these approaches also support performance portability, as currently investigated in the EU FP7 project PEPPHER. In particular, we consider (1) a library-based approach, here represented by the integration of the SkePU C++ skeleton programming library with the StarPU runtime system for dynamic scheduling and dynamic selection of suitable execution units for parallel tasks; (2)...

10.5555/2492708.2493051 article EN Design, Automation, and Test in Europe 2012-03-12

We describe the principles of a novel framework for performance-aware composition of sequential and explicitly parallel software components with implementation variants. Automatic composition results in a table-driven implementation that, for each call of a component, looks up the expected best implementation variant, processor allocation and schedule given the current problem and group sizes. The dispatch tables are computed off-line at component deployment time by an interleaved dynamic programming algorithm from time-prediction meta-code provided by the component supplier....

10.1002/cpe.1844 article EN Concurrency and Computation Practice and Experience 2011-09-22
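
The table-driven lookup described in the abstract above can be sketched as follows. This is a simplified illustration, not the framework's implementation: the type `DispatchEntry`, the functions `lookup` and `example_table`, and all table values are hypothetical, and the table is keyed by problem size only. In the framework the tables are computed off-line from time-prediction meta-code; here they are hard-coded.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <string>

// Hypothetical table entry: which implementation variant to run and
// how many processors to allocate to it.
struct DispatchEntry {
    std::string variant;
    int processors;
};

// Key: smallest problem size for which the entry applies.
using DispatchTable = std::map<std::size_t, DispatchEntry>;

// Returns the entry whose key is the largest one not exceeding problem_size.
inline const DispatchEntry& lookup(const DispatchTable& table, std::size_t problem_size) {
    assert(!table.empty() && table.begin()->first <= problem_size);
    auto it = table.upper_bound(problem_size);  // first key greater than problem_size
    return std::prev(it)->second;
}

// Made-up values standing in for a table computed at deployment time.
inline DispatchTable example_table() {
    return {
        {0, {"sequential", 1}},
        {1000, {"parallel", 4}},
        {100000, {"parallel", 16}},
    };
}
```

At each component call, the runtime cost is a single ordered-map lookup; the expensive optimization work stays in the off-line table construction.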

10.1007/s10766-015-0357-6 article EN International Journal of Parallel Programming 2015-03-21

Exploiting effectively massively parallel architectures is a major challenge that stream programming can help facilitate. We investigate the problem of generating energy-optimal code for a collection of streaming tasks that include parallelizable or moldable tasks on a generic manycore processor with dynamic discrete frequency scaling. Streaming task collections differ from classical task sets in that all tasks are running concurrently, so that cores typically run several tasks that are scheduled round-robin at user level in a data-driven way. A...

10.1145/2687653 article EN ACM Transactions on Architecture and Code Optimization 2015-01-09

We present the third generation of the C++-based open-source skeleton programming framework SkePU. Its main new features include new skeletons, new data container types, support for returning multiple objects from skeleton instances and user functions, support for specifying alternative platform-specific user functions to exploit e.g. custom SIMD instructions, generalized scheduling variants for the multicore CPU backends, and a new cluster-backend targeting the MPI interface provided by the StarPU task-based runtime system. We have also revised...

10.1007/s10766-021-00704-3 article EN cc-by International Journal of Parallel Programming 2021-05-19

10.1023/a:1026511306490 article EN The Journal of Supercomputing 2000-01-01

Cell Broadband Engine is a heterogeneous multicore processor for high-performance computing and gaming. Its architecture allows for an impressive peak performance but, at the same time, makes it very hard to write efficient code. The need to simultaneously exploit SIMD instructions, coordinate parallel execution of the slave processors, overlap DMA memory traffic with computation, keep data properly aligned in memory, and explicitly manage the small on-chip memory buffers leads to very complex code. In this work, we adopt skeleton...

10.1145/1370082.1370088 article EN 2008-05-11

10.1109/date.2012.6176582 article EN Design, Automation & Test in Europe Conference & Exhibition (DATE), 2015 2012-03-01

The PEPPHER component model defines an environment for annotation of native C/C++ based components for homogeneous and heterogeneous multicore and manycore systems, including GPU and multi-GPU based systems. For the same computational functionality, captured as a component, different sequential and explicitly parallel implementation variants using various types of execution units might be provided, together with metadata such as explicitly exposed tunable parameters. The goal is to compose an application from its components such that, depending on...

10.1109/sc.companion.2012.97 article EN 2012-11-01

This article describes a knowledge-based system for automatic parallelization of a wide class of sequential numerical codes operating on vectors and dense matrices, and for execution on distributed memory message-passing multiprocessors. Its main feature is a fast and powerful pattern recognition tool that locally identifies frequently occurring computations and programming concepts in the source code. This tool also works for dusty deck codes that have been "encrypted" by former machine-specific code transformations. Successful pattern recognition guides...

10.1155/1996/406379 article EN cc-by Scientific Programming 1995-08-22

In this paper we present two algorithms for integrated code generation for clustered VLIW architectures. One algorithm is a heuristic based on genetic algorithms, the other is based on integer linear programming. The performance of the algorithms is compared on a portion of the Mediabench [10] benchmark suite. We found the results of the genetic algorithm to be within one or two clock cycles from optimal for the cases where the optimum is known. In addition, the heuristic algorithm produces results in predictable time also when the integer linear program fails.

10.1145/1361096.1361099 article EN 2008-03-13

In this work we report results from a new integrated method of automatically generating parallel code from Modelica models by combining parallelization at two levels of abstraction. Performing inline expansion of a Runge-Kutta solver combined with fine-grained automatic parallelization of the right-hand side of the resulting equation system opens up new possibilities for generating high performance code, which is becoming increasingly relevant when multi-core computers are becoming commonplace. An implementation, in the form of a backend module for the OpenModelica...

10.1145/1556444.1556451 article EN ACM SIGARCH Computer Architecture News 2008-12-20