- Parallel Computing and Optimization Techniques
- Advanced Data Storage Technologies
- Distributed and Parallel Computing Systems
- Distributed Systems and Fault Tolerance
- Cloud Computing and Resource Management
- Embedded Systems Design Techniques
- Software System Performance and Reliability
- Advanced Memory and Neural Computing
- Ferroelectric and Negative Capacitance Devices
- Scientific Computing and Data Management
- Advanced Mathematical Identities
- Modular Robots and Swarm Intelligence
- Radiation Effects in Electronics
- Caching and Content Delivery
- Algorithms and Data Compression
- Interconnection Networks and Systems
- Image and Signal Denoising Methods
- Analytic Number Theory Research
- Data Visualization and Analytics
- Advanced Data Compression Techniques
- Peer-to-Peer Network Technologies
- Real-Time Systems Scheduling
- Coding Theory and Cryptography
Commissariat à l'Énergie Atomique et aux Énergies Alternatives (2015-2024)
Université Paris-Saclay (2014-2023)
CEA DAM Île-de-France (2014-2023)
Maison de la Simulation (2020-2022)
CEA Paris-Saclay (2020-2022)
Lawrence Livermore National Laboratory (2020)
Institut Polytechnique de Bordeaux (2020)
Intel (United States) (2020)
Institut Lavoisier de Versailles (2012)
University of California, Irvine (2012)
In the race for Exascale, the advent of many-core processors will bring a shift in parallel computing architectures toward systems with much higher concurrency, but with relatively less memory per thread. This raises concerns about the adaptability of HPC software written for the current generation of machines to this brave new world. In this paper, we study domain splitting over an increasing number of areas as an example problem where a negative performance impact on computation could arise. We identify the specific parameters that drive this scalability problem, and...
This paper offers a timely study of, and proposed clarifications, revisions, and enhancements to, the Message Passing Interface's (MPI's) Semantic Terms and Conventions. To enhance MPI, a clearer understanding of the meaning of key terminology has proven essential, and, surprisingly, important concepts remain underspecified, ambiguous, and in some cases inconsistent and/or conflicting despite 26 years of standardization. This work addresses these concerns comprehensively and usefully informs MPI developers, implementors, and those...
The advent of many-core architectures poses new challenges to the MPI programming model, which was designed for distributed-memory message passing. It is now clear that MPI will have to evolve in order to exploit shared-memory parallelism, either by collaborating with other models (MPI+X) or by introducing new approaches. This paper considers extensions to C and C++ that make it possible to run MPI Processes inside threads. More generally, a thread-local storage (TLS) library is developed to simplify the collocation of arbitrary...
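A minimal sketch of the underlying mechanism, not the paper's TLS library: when each "MPI process" becomes a thread of a single OS process, formerly global state must become thread-local to stay private to its process. The names below (`mpi_rank_local`, `mpi_process_body`) are illustrative only.

```cpp
#include <cstdio>
#include <thread>
#include <vector>

// A global that would be shared (and thus corrupted) across thread-based
// MPI processes if it stayed a plain global variable.
thread_local int mpi_rank_local = -1;

void mpi_process_body(int rank) {
    mpi_rank_local = rank;  // each thread sees its own copy
    std::printf("process %d sees rank %d\n", rank, mpi_rank_local);
}

int main() {
    std::vector<std::thread> procs;
    for (int r = 0; r < 4; ++r)
        procs.emplace_back(mpi_process_body, r);
    for (auto &t : procs) t.join();
}
```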
Today's trend to use accelerators in heterogeneous systems forces a paradigm shift in programming models. The use of low-level APIs for accelerator programming is tedious and not intuitive for casual programmers. To tackle this problem, recent approaches have focused on high-level directive-based models, with a standardization effort made by OpenACC and the accelerator directives of the latest OpenMP 4.0 release. Pragmas for data management automatically handle the data exchange between host and device. To keep the runtime simple and efficient, severe restrictions...
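For illustration, a hedged sketch of the directive-based style described above, using OpenMP 4.x target offload (standard directives, not code from the paper); the map() clauses let the runtime handle host/device data exchange automatically.

```cpp
// Compile with an offload-capable compiler, e.g. clang++ -fopenmp.
#include <cstdio>

int main() {
    const int n = 1 << 20;
    float *a = new float[n], *b = new float[n];
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    // Data movement to and from the device is expressed declaratively.
    #pragma omp target map(to: b[0:n]) map(tofrom: a[0:n])
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        a[i] += b[i];

    std::printf("a[0] = %f\n", a[0]);  // expected: 3.0
    delete[] a; delete[] b;
}
```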
Fault tolerance has always been an important topic when it comes to running massively parallel programs at scale. Statistically, hardware and software failures are expected to occur more often on systems gathering millions of computing units. Moreover, the larger jobs are, the more hours would be wasted by a crash. In this paper, we describe work done in our MPI runtime to enable a transparent checkpointing mechanism. Unlike the MPI 4.0 User-Level Failure Mitigation (ULFM) interface, this work targets solely...
MPI-3 provides functions for non-blocking collectives. To help programmers introduce these collectives into existing MPI programs, we improve the PARCOACH tool, which checks the correctness of collective call sequences. These enhancements focus on the correct sequencing of all flavors of collective calls, and on the presence of completion calls for non-blocking communications. The evaluation shows an overhead under 10% of the original compilation time.
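A hedged illustration of the property being checked: every MPI-3 non-blocking collective must be paired with a completion call on every rank, and collectives must be issued in the same order everywhere.

```cpp
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    MPI_Request req;
    MPI_Ibarrier(MPI_COMM_WORLD, &req);  // non-blocking collective

    /* ... computation overlapping the barrier ... */

    // Dropping this completion call, or reaching it on only some ranks,
    // is the kind of defect such a tool reports at compile time.
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    MPI_Finalize();
}
```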
Stencil-based computation on structured grids is a kernel at the heart of a large number of scientific applications. The variety of stencil kernels used in practice makes this pattern difficult to assemble into a high-performance computing library. With the multiplication of cores on a single chip, answering architectural alignment requirements became an even more important key to performance. Along with vectorized accesses, data layout optimization must also consider concurrent parallel accesses. In this paper, we develop...
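A minimal sketch of the pattern under discussion, not the paper's library: a 1D 3-point stencil, with storage padded to a multiple of an assumed SIMD width so rows start on aligned boundaries. `VLEN` and the coefficients are illustrative assumptions.

```cpp
#include <vector>

constexpr int VLEN = 8;  // assumed SIMD width in floats (e.g. AVX)

inline int padded(int n) { return (n + VLEN - 1) / VLEN * VLEN; }

void stencil3(const std::vector<float> &in, std::vector<float> &out, int n) {
    // Each output depends on a fixed neighborhood of the input:
    // the classic stencil shape.
    for (int i = 1; i < n - 1; ++i)
        out[i] = 0.25f * in[i - 1] + 0.5f * in[i] + 0.25f * in[i + 1];
}

int main() {
    int n = 1000, np = padded(n);  // padding keeps storage vector-aligned
    std::vector<float> in(np, 1.0f), out(np, 0.0f);
    stencil3(in, out, n);
}
```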
This paper describes a short and simple way of improving the performance of vector operations (e.g. X = aY + bZ + ...) applied to large vectors. In a previous paper [1], we described how to take advantage of the high-performance copy operation provided by the ATLAS library [2] in the context of the C++ Expression Template (ET) mechanism. Here we present a multi-threaded implementation of this approach. The proposed ET implementation, which involves a parallel blocking technique, leads to a significant performance increase compared with existing implementations (up to x2.7) on a dual-socket...
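A minimal, hedged sketch of the Expression Template idea only (the paper additionally blocks and threads the evaluation loop): the expression aY + bZ is captured as a type and evaluated in a single fused loop at assignment time, avoiding temporaries. All names here are illustrative.

```cpp
#include <cstddef>
#include <vector>

// Node representing a*Y + b*Z lazily, element by element.
struct Axpby {
    double a, b;
    const std::vector<double> &y, &z;
    double operator[](std::size_t i) const { return a * y[i] + b * z[i]; }
};

struct Vec {
    std::vector<double> data;
    explicit Vec(std::size_t n, double v = 0.0) : data(n, v) {}
    // Single evaluation loop: X = aY + bZ without intermediate vectors.
    Vec &operator=(const Axpby &e) {
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = e[i];
        return *this;
    }
};

int main() {
    Vec x(1000), y(1000, 1.0), z(1000, 2.0);
    x = Axpby{2.0, 3.0, y.data, z.data};  // x[i] = 2*1 + 3*2 = 8
}
```

A full ET library would build such nodes via overloaded operator+ and operator*; the fused assignment loop above is the piece that a parallel blocking technique would then split across threads.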
Partitioned point-to-point communication and persistent collective communication were both recently standardized in MPI-4.0. Each offers performance and scalability advantages over MPI-3.1-based alternatives when planned transfers are feasible in an MPI application. Their merger into a generalized, persistent collective communication with partitions is the logical next step, and is significant for portability. Non-trivial decisions about the syntax and semantics of such operations need to be addressed, including the scope of knowledge of partitioning choices by the members of a communicator's...
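For context, a hedged sketch of MPI-4.0 partitioned point-to-point communication, one of the two ingredients being merged: the send buffer is split into partitions that can be marked ready one by one (e.g. by different threads), letting transfers start before the whole buffer is produced. Run with at least 2 ranks.

```cpp
#include <mpi.h>
#include <vector>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int parts = 4;
    const MPI_Count per_part = 1024;  // elements per partition
    std::vector<double> buf(parts * per_part);
    MPI_Request req = MPI_REQUEST_NULL;

    if (rank == 0) {
        MPI_Psend_init(buf.data(), parts, per_part, MPI_DOUBLE,
                       1, 0, MPI_COMM_WORLD, MPI_INFO_NULL, &req);
        MPI_Start(&req);
        for (int p = 0; p < parts; ++p) {
            /* ... fill partition p of buf ... */
            MPI_Pready(p, req);  // partition p may be transferred now
        }
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    } else if (rank == 1) {
        MPI_Precv_init(buf.data(), parts, per_part, MPI_DOUBLE,
                       0, 0, MPI_COMM_WORLD, MPI_INFO_NULL, &req);
        MPI_Start(&req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    }
    if (req != MPI_REQUEST_NULL) MPI_Request_free(&req);
    MPI_Finalize();
}
```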
The Message Passing Interface (MPI) is a parallel programming model used to exchange data between working units on different nodes of a supercomputer. While MPI blocking operations return when the communication is complete, non-blocking and persistent operations return before completion, enabling the developer to hide communication latency. However, the usage of these latter operations comes with additional rules the user has to abide by. This is error prone, which makes verification tools valuable for program writers. PARCOACH is a framework that detects collective errors using...
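A hedged example of the class of bug such verification tools look for: a collective guarded by a rank-dependent branch, so not all ranks of the communicator reach the matching call.

```cpp
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank; MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank % 2 == 0) {
        // BUG: only even ranks participate in this collective; odd ranks
        // never call it, so the program deadlocks.
        MPI_Barrier(MPI_COMM_WORLD);
    }

    MPI_Finalize();
}
```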
The complexity of the memory system of High Performance Computing nodes increases in order to face applications' growing memory usage and the widening gap between computation and memory access speeds. As these technologies are just being introduced in HPC supercomputers, no one knows whether it is better to manage them with hardware or software solutions. Thus both are studied in parallel. For software solutions, the problem consists of choosing which data to store on which memory at any time.
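A hedged illustration of the software side of that choice, assuming the memkind library is available (this is not the paper's approach, just one common way to express the placement decision in code; link with -lmemkind): hot data goes to high-bandwidth memory if present, cold data to ordinary DRAM.

```cpp
#include <memkind.h>
#include <cstdio>

int main() {
    const size_t n = 1 << 20;

    // Frequently accessed array: prefer HBM, fall back to DRAM if absent.
    double *hot = static_cast<double *>(
        memkind_malloc(MEMKIND_HBW_PREFERRED, n * sizeof(double)));

    // Rarely accessed array: ordinary DRAM is fine.
    double *cold = static_cast<double *>(
        memkind_malloc(MEMKIND_DEFAULT, n * sizeof(double)));

    if (!hot || !cold) { std::puts("allocation failed"); return 1; }
    /* ... computation ... */
    memkind_free(MEMKIND_HBW_PREFERRED, hot);
    memkind_free(MEMKIND_DEFAULT, cold);
}
```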
To amortize the cost of MPI collective operations, nonblocking collectives have been proposed so as to allow communications to be overlapped with computation. Unfortunately, they are more CPU-hungry than point-to-point communications, and running them in a communication thread on a dedicated CPU core makes the computation slow. On the other hand, running them on application cores leads to no overlap. In this article, we propose placement algorithms for progress threads that do not degrade computation performance while still achieving communication/computation overlap. We first show even...
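A hedged sketch of the overlap pattern being optimized: the nonblocking collective is issued, independent work runs while the MPI progress engine (possibly a progress thread) advances the collective, and the result is awaited only when needed. How much overlap is actually achieved depends on where progress threads are placed.

```cpp
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    double local = 1.0, sum = 0.0, acc = 0.0;
    MPI_Request req;

    MPI_Iallreduce(&local, &sum, 1, MPI_DOUBLE, MPI_SUM,
                   MPI_COMM_WORLD, &req);

    // Independent computation overlapped with the collective.
    for (int i = 0; i < 1000000; ++i) acc += 1e-6 * i;

    MPI_Wait(&req, MPI_STATUS_IGNORE);  // result needed only now

    MPI_Finalize();
    return acc < 0;  // keep 'acc' live so the loop is not optimized away
}
```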
Many-core processors are imposing new constraints on parallel applications. In particular, the MPI+X model, or hybridization, is becoming a compulsory avenue to extract performance by mitigating both memory and communication overheads. In this context, tools also have to evolve in order to represent the more complex states combining multiple runtimes and programming models. In this paper, we propose to start from a well-known metric, Speedup, showing that it can be bounded by the acceleration of any program section. From this observation,...
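For reference, the bound alluded to here is the classical Amdahl relation (a standard formulation, not reproduced from the paper): if a section occupying a fraction p of the runtime is accelerated by a factor s, the whole-program speedup is limited accordingly.

```latex
% Amdahl's law: speedup from accelerating a fraction p of the runtime by s.
% Even as s -> infinity, S is capped by the untouched fraction 1 - p.
S(p, s) = \frac{1}{(1 - p) + \dfrac{p}{s}} \;\le\; \frac{1}{1 - p}
```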
Persistent collective communications have recently been voted into the MPI standard, opening the door to many optimizations that reduce the cost of collectives, in particular for recurring operations. Indeed, the persistent semantics contains an initialization phase called only once for a specific collective. It can be used to concentrate the building costs necessary to the collective, and so avoid paying them each time the operation is performed.
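A hedged sketch of the persistent-collective pattern (MPI-4.0 interface): the expensive setup happens once in the *_init call, then the same collective is restarted cheaply every iteration with MPI_Start.

```cpp
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    double local = 1.0, sum = 0.0;
    MPI_Request req;

    // One-time initialization: algorithm selection, buffer setup, etc.
    MPI_Allreduce_init(&local, &sum, 1, MPI_DOUBLE, MPI_SUM,
                       MPI_COMM_WORLD, MPI_INFO_NULL, &req);

    for (int iter = 0; iter < 100; ++iter) {
        /* ... update 'local' ... */
        MPI_Start(&req);  // reuse the prepared collective
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    }

    MPI_Request_free(&req);  // release the persistent request
    MPI_Finalize();
}
```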