- Parallel Computing and Optimization Techniques
- Superconducting Materials and Applications
- Cloud Computing and Resource Management
- Embedded Systems Design Techniques
- Advanced Data Storage Technologies
- Distributed Systems and Fault Tolerance
- Radiation Effects in Electronics
- Distributed and Parallel Computing Systems
- Service-Oriented Architecture and Web Services
- Mobile Agent-Based Network Management
- Interconnection Networks and Systems
- Ferroelectric and Negative Capacitance Devices
- Software System Performance and Reliability
Département d'Informatique
2024
Télécom Paris
2024
Institut Polytechnique de Paris
2024
Telecom SudParis
2024
Colorado State University
2021-2022
Université Grenoble Alpes
2022
Institut polytechnique de Grenoble
2022
Laboratoire d'Informatique de Grenoble
2022
Centre Inria de l'Université Grenoble Alpes
2022
Centre National de la Recherche Scientifique
2022
The open-source and community-supported gem5 simulator is one of the most popular tools for computer architecture research. This simulation infrastructure allows researchers to model modern hardware at the cycle level, and it has enough fidelity to boot unmodified Linux-based operating systems and run full applications for multiple architectures including x86, Arm, and RISC-V. It has been under active development over the last nine years since its original release. In this time, there have been 7500 commits to the codebase from 250 unique...
Memory bandwidth is known to be a performance bottleneck for FPGA accelerators, especially when they deal with large multi-dimensional data sets. A body of work focuses on reducing off-chip transfers, but few authors try to improve the efficiency of these transfers. This paper addresses the latter issue by proposing (i) a compiler-based approach to the accelerator's data layout to maximize contiguous access to memory, and (ii) packing and runtime compression techniques that take advantage of this layout to further improve memory performance. We...
Modern Out-of-Order (OoO) CPUs are complex systems with many components interleaved in non-trivial ways. Pinpointing performance bottlenecks and understanding the underlying causes of program performance issues are critical tasks to fully exploit the performance offered by hardware resources. Current debugging approaches rely either on measuring resource utilization, in order to estimate which parts of a CPU induce limitations, or on code-based analysis deriving bottleneck information from capacity/throughput models. These approaches are limited...
In a super-scalar architecture, the scheduler dynamically assigns micro-operations ($\mu$OPs) to execution ports. The port mapping of an architecture describes how an instruction decomposes into $\mu$OPs and lists, for each $\mu$OP, the set of ports it can be mapped to. It is used by compilers and performance debugging tools to characterize the throughput of a sequence of instructions repeatedly executed as the core component of a loop. This paper introduces a dual equivalent representation: resource abstract model where, executed,...
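To make the notion of a port mapping concrete, here is a minimal Python sketch. The instruction names, port names, and the `PORT_MAPPING` table are hypothetical and do not describe any real CPU. The function computes the usual port-pressure lower bound on cycles per loop iteration: the maximum, over every subset of ports, of the number of $\mu$OPs that can only run inside that subset divided by the subset's size. It illustrates how a port mapping is used to characterize throughput; it is not the dual resource representation the paper introduces.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical port mapping: instruction -> list of (uop count, ports usable by that uop).
PORT_MAPPING = {
    "ADD": [(1, frozenset({"p0", "p1"}))],
    "MUL": [(1, frozenset({"p0"}))],
    "STORE": [(1, frozenset({"p2"})), (1, frozenset({"p3"}))],
}


def cycles_per_iteration(kernel, mapping=PORT_MAPPING):
    """Port-pressure lower bound on steady-state cycles per loop iteration.

    `kernel` maps an instruction name to its number of occurrences in the loop body.
    Assumes each port executes at most one uop per cycle.
    """
    # Flatten the kernel into (number of uops, allowed ports) pairs.
    uops = [
        (count * occurrences, ports)
        for instruction, occurrences in kernel.items()
        for count, ports in mapping[instruction]
    ]
    all_ports = sorted(set().union(*(ports for _, ports in uops)))

    bound = Fraction(0)
    # Brute-force enumeration of port subsets; fine for a handful of ports.
    for size in range(1, len(all_ports) + 1):
        for subset in combinations(all_ports, size):
            subset = frozenset(subset)
            # uops whose allowed ports all lie inside the subset must execute there.
            load = sum(n for n, ports in uops if ports <= subset)
            bound = max(bound, Fraction(load, size))
    return bound


if __name__ == "__main__":
    print(cycles_per_iteration({"ADD": 2, "MUL": 1}))
```

With this toy table, a loop body of two `ADD` and one `MUL` is bound by the pair `{p0, p1}`, which must absorb three $\mu$OPs per iteration. The subset enumeration is exponential in the number of ports, so it is only meant for small illustrative mappings.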