Michalis Gianioudis
- Parallel Computing and Optimization Techniques
- Interconnection Networks and Systems
- Embedded Systems Design Techniques
- Distributed Systems and Fault Tolerance
- Advanced Optical Network Technologies
- Radiation Effects in Electronics
- Distributed and Parallel Computing Systems
- Advanced Data Storage Technologies
- E-commerce and Technology Innovations
Foundation for Research and Technology Hellas
2022-2025
FORTH Institute of Computer Science
2022
Universitat Politècnica de València
2020
We present and evaluate the ExaNeSt Prototype, which compactly packages 128 Xilinx ZU9EG MPSoCs, 2 TBytes of DRAM, and 8 TBytes of SSD into a liquid-cooled rack, using custom interconnection hardware based on 10 Gbps links. We developed this testbed in 2016-2019 in order to leverage the flexibility of FPGAs for experimenting with efficient support for HPC communication among tens of thousands of processors and accelerators, in the quest towards Exascale systems and beyond. In the years since then, we carefully studied the system, our key design...
Remote Direct Memory Access (RDMA) is widely used in High-Performance Computing (HPC) while making inroads in datacenters and accelerators. State-of-the-art RDMA engines typically do not endure page faults; users are therefore forced to pin their buffers, which complicates the programming model, limits memory utilization, and moves the pressure to the Network Interface Cards (NICs). In this article we introduce a mechanism for handling dynamic page faults during RDMA, named PART, suitable for emerging processors that...
To enable Exascale computing, next-generation interconnection networks must scale to hundreds of thousands of nodes and provide features that also allow HPC, HPDA, and AI applications to reach Exascale, while benefiting from new hardware and software trends. RED-SEA will pave the way for European interconnects, including BXI, as follows: (i) specify the architecture using hardware-software co-design and a set of applications representative of the terrain of converging HPC, HPDA, and AI; (ii) test, evaluate, and/or implement the architectural features at multiple...
Low-latency inter-node communication is important in HPC clusters. In this work, we design and integrate a low-cost interconnect, capable of low-latency user-level communication, with open-source RISC-V processors, obviating the need for bulky and expensive network interface cards connected over PCI. Our lean interconnect is next to the Load/Store (LD/ST) stage of the processor, which we modify to achieve back-to-back stores to the address range dedicated to the NI. The primitives that we examine are suitable for many-to-one communication and optimized for small messages, while...