NFDI4DS | UHH-SEMS - Publication Details

Nicolas Derumigny

ORCID: 0000-0002-0224-4098

About

Contact & Profiles

A5058893408

Research Areas

Parallel Computing and Optimization Techniques
Superconducting Materials and Applications
Cloud Computing and Resource Management
Embedded Systems Design Techniques
Advanced Data Storage Technologies
Distributed systems and fault tolerance
Radiation Effects in Electronics
Distributed and Parallel Computing Systems
Service-Oriented Architecture and Web Services
Mobile Agent-Based Network Management
Interconnection Networks and Systems
Ferroelectric and Negative Capacitance Devices
Software System Performance and Reliability

Département d'Informatique
2024

Télécom Paris
2024

Institut Polytechnique de Paris
2024

Telecom SudParis
2024

Colorado State University
2021-2022

Université Grenoble Alpes
2022

Institut polytechnique de Grenoble
2022

Laboratoire d'Informatique de Grenoble
2022

Centre Inria de l'Université Grenoble Alpes
2022

Centre National de la Recherche Scientifique
2022

The gem5 Simulator: Version 20.0+

OPENALEX - Publications

Jason Lowe-Power, Abdul Mutaal Ahmad, Ayaz Akram, Mohammad Alian, Rico Amslinger, and 73 more

The open-source and community-supported gem5 simulator is one of the most popular tools for computer architecture research. This simulation infrastructure allows researchers to model modern hardware at the cycle level, and it has enough fidelity to boot unmodified Linux-based operating systems and run full applications for multiple architectures including x86, Arm, and RISC-V. The gem5 simulator has been under active development over the last nine years since the original release. In this time, there have been over 7500 commits to the codebase from over 250 unique...

10.48550/arxiv.2007.03152 preprint EN cc-by arXiv (Cornell University) 2020-01-01

PALMED: Throughput Characterization for Superscalar Architectures

Nicolas Derumigny, Théophile Bastian, Fabian M. Gruber, Guillaume Iooss, Christophe Guillon, and 2 more

In a super-scalar architecture, the scheduler dynamically assigns micro-operations ($\mu$OPs) to execution ports. The port mapping of an architecture describes how an instruction decomposes into $\mu$OPs and lists for each $\mu$OP the set of ports it can be mapped to. It is used by compilers and performance debugging tools to characterize the throughput of a sequence of instructions repeatedly executed as the core component of a loop. This paper introduces a dual equivalent representation: the resource mapping of an architecture is an abstract model where, to be executed,...

10.1109/cgo53902.2022.9741289 preprint EN 2022-03-29

An Irredundant and Compressed Data Layout to Optimize Bandwidth Utilization of FPGA Accelerators

Corentin Ferry, Nicolas Derumigny, Steven Derrien, Sanjay Rajopadhye

Memory bandwidth is known to be a performance bottleneck for FPGA accelerators, especially when they deal with large multi-dimensional data-sets. A body of work focuses on reducing off-chip transfers, but few authors try to improve the efficiency of transfers. This paper addresses the latter issue by proposing (i) a compiler-based approach to the accelerator's data layout to maximize contiguous access to memory, and (ii) data packing and runtime compression techniques that take advantage of this layout to further improve memory performance. We...

10.48550/arxiv.2401.12071 preprint EN other-oa arXiv (Cornell University) 2024-01-01

Performance Debugging through Microarchitectural Sensitivity and Causality Analysis

Alban Dutilleul, Hugo Pompougnac, Nicolas Derumigny, Gabriel Rodríguez, Valentin Trophime, and 2 more

Modern Out-of-Order (OoO) CPUs are complex systems with many components interleaved in non-trivial ways. Pinpointing performance bottlenecks and understanding the underlying causes of program performance issues are critical tasks to fully exploit the performance offered by hardware resources. Current performance debugging approaches rely either on measuring resource utilization, in order to estimate which parts of a CPU induce performance limitations, or on code-based analysis deriving bottleneck information from capacity/throughput models. These approaches are limited...

10.48550/arxiv.2412.13207 preprint EN cc-by arXiv (Cornell University) 2024-12-03

Userland Page Table - A Key for Transparent Persistent Memory

Jana Toljaga, Nicolas Derumigny, Yohan Pipereau, Mathieu Bacou, Gaël Thomas

10.1145/3704440.3704774 article EN 2024-12-02

PALMED: Throughput Characterization for Superscalar Architectures -- Extended Version

Nicolas Derumigny, Fabian M. Gruber, Théophile Bastian, Guillaume Iooss, Christophe Guillon, and 2 more

In a super-scalar architecture, the scheduler dynamically assigns micro-operations ($\mu$OPs) to execution ports. The port mapping of an architecture describes how an instruction decomposes into $\mu$OPs and lists for each $\mu$OP the set of ports it can be mapped to. It is used by compilers and performance debugging tools to characterize the throughput of a sequence of instructions repeatedly executed as the core component of a loop. This paper introduces a dual equivalent representation: the resource mapping of an architecture is an abstract model where, to be executed,...

10.48550/arxiv.2012.11473 preprint EN other-oa arXiv (Cornell University) 2020-01-01
