NFDI4DS | UHH-SEMS - Publication Details

Filippo Spiga

ORCID: 0000-0003-1448-5304

About

Contact & Profiles

A5070653336

Research Areas

Parallel Computing and Optimization Techniques
Advanced Data Storage Technologies
Distributed and Parallel Computing Systems
Scientific Computing and Data Management
Fluid Dynamics and Turbulent Flows
Lattice Boltzmann Simulation Studies
Fluid Dynamics Simulations and Interactions
Fluid Dynamics and Heat Transfer
Spacecraft and Cryogenic Technologies
Interconnection Networks and Systems
Computational Fluid Dynamics and Aerodynamics
Embedded Systems Design Techniques
Computational Physics and Python Applications
Matrix Theory and Algorithms
Gamma-ray bursts and supernovae
Particle Detector Development and Performance
Advanced Numerical Methods in Computational Mathematics
Nanopore and Nanochannel Transport Studies
Digital Transformation in Industry
Cell Image Analysis Techniques
Cardiac, Anesthesia and Surgical Outcomes
Industrial Vision Systems and Defect Detection
Aortic aneurysm repair treatments
Combustion and flame dynamics
Semiconductor materials and devices

Nvidia (United Kingdom)
2022-2025

Nvidia (United States)
2023

ARM (United Kingdom)
2018-2019

University of Cambridge
2015-2018

University of Turin
2018

Irish Centre for High-End Computing
2011-2012

University of Milano-Bicocca
2011

Strong scaling of general-purpose molecular dynamics simulations on GPUs

OPENALEX - Publications

Jens Gläser, Trung Dac Nguyen, Joshua A. Anderson, Pak Lui, Filippo Spiga and 3 more

10.1016/j.cpc.2015.02.028 article EN publisher-specific-oa Computer Physics Communications 2015-03-11

Quantum ESPRESSO: One Further Step toward the Exascale

Ivan Carnimeo, Fabio Affinito, Stefano Baroni, Oscar Baseggio, Laura Bellentani and 6 more

We review the status of the Quantum ESPRESSO software suite for electronic-structure calculations based on plane waves, pseudopotentials, and density-functional theory. We highlight the recent developments in the porting to GPUs of the main codes, using an approach based on OpenACC and CUDA Fortran offloading. We describe, in particular, the results achieved on linear-response codes, which are one of the distinctive features of the Quantum ESPRESSO suite. We also present extensive performance benchmarks on different GPU-accelerated architectures for the main codes of the suite.

10.1021/acs.jctc.3c00249 article EN cc-by Journal of Chemical Theory and Computation 2023-07-31

FluTAS: A GPU-accelerated finite difference code for multiphase flows

Marco Crialesi-Esposito, Nicolò Scapin, Andreas D. Demou, Marco Edoardo Rosti, Pedro Costa and 2 more

We present the Fluid Transport Accelerated Solver, FluTAS, a scalable GPU code for multiphase flows with thermal effects. The code solves the incompressible Navier-Stokes equation for two-fluid systems, with a direct FFT-based Poisson solver for the pressure equation. The interface between the two fluids is represented with the Volume of Fluid (VoF) method, which is mass conserving and well suited for complex flows thanks to its capacity of handling topological changes. The energy equation is explicitly solved and coupled with the momentum equation through the Boussinesq approximation. The code is conceived in...

10.1016/j.cpc.2022.108602 article EN cc-by Computer Physics Communications 2022-11-24

HPC4AI

Marco Aldinucci, Sergio Rabellino, Marco Pironti, Filippo Spiga, Paolo Viviani and 16 more

In April 2018, under the auspices of the POR-FESR 2014-2020 program of the Italian Piedmont Region, Turin's Centre on High-Performance Computing for Artificial Intelligence (HPC4AI) was funded with a capital investment of 4.5M€ and it began its deployment. HPC4AI aims to facilitate scientific research and engineering in the areas of Artificial Intelligence and Big Data Analytics. HPC4AI will specifically focus on methods for the on-demand provisioning of AI and BDA Cloud services to the regional and national industrial community, which includes the large ecosystem of Small-Medium...

10.1145/3203217.3205340 article EN 2018-05-08

NVIDIA Grace Superchip Early Evaluation for HPC Applications

Fabio Banchelli, Joan Vinyals-Ylla-Catala, Josep Pocurull, Marc Clascà, Kilian Peiro and 3 more

Arm-based systems in HPC have been a reality for more than a decade. However, when a new chip enters the market, it always implies challenges, not only at the ISA level, but also with regards to the SoC integration, the memory subsystem, the board integration, the node interconnection, and finally the OS and all layers of the system software (compiler and libraries). Guided by the procurement of an NVIDIA Grace HPC cluster within the deployment of MareNostrum 5, and emulating the approach of a scientist who needs to migrate its scientific research to a new system, we evaluated five complex...

10.1145/3636480.3637284 article EN 2024-01-08

Accelerated linear algebra for large scale DFT calculations of materials on CPU/GPU architectures with CRYSTAL

Giacomo Ambrogio, Lorenzo Donà, Jacques K. Desmarais, Chiara Ribaldone, Silvia Casassa and 3 more

We discuss the implementation strategy, numerical accuracy, and computational performance of the acceleration of linear algebra operations through graphics processing units (GPUs) for the self-consistent field driver of the Crystal electronic structure package for solid state density functional theory simulations. Accelerated tasks include matrix multiplication, diagonalization, and inversion, as well as Cholesky decomposition. The scaling of the implemented strategy over multiple accelerating devices is assessed in the range of 1–8...

10.1063/5.0250793 article EN The Journal of Chemical Physics 2025-02-25

Preliminary Study on Fine-Grained Power and Energy Measurements on Grace Hopper GH200 with Open-Source Performance Tools

Óscar Hernández, T. Wang, Wael Elwasif, Filippo Spiga, Francesca Tartaglione and 2 more

10.1145/3703001.3724383 article EN 2025-02-19

A tale of two codes: CUDA vs OpenACC for mass-zero constrained dynamics

Alessia Vignolo, Taylor J. Baird, Filippo Spiga, Claudia Canevari, Alessandro Coretti and 4 more

Speed and efficiency of codes for atomistic simulations can be improved through refactoring and tailoring for GPU architectures. This activity, however, comes with associated, and often overlooked, costs, namely a reduced readability and flexibility of codes upon optimization and a non-negligible development time. The first element becomes particularly cogent when whoever carries out the code porting task is not the creator of the algorithm. In this manuscript we investigate these issues by developing and comparing a CUDA (Compute Unified...

10.1177/10943420251331673 article EN The International Journal of High Performance Computing Applications 2025-04-18

phiGEMM: A CPU-GPU Library for Porting Quantum ESPRESSO on Hybrid Systems

Filippo Spiga, Ivan Girotto

GPU computing has revolutionized HPC by bringing the performance of the supercomputer to the desktop. Attractive price, performance, and power characteristics allow multiple GPUs to be plugged into both desktop machines as well as supercomputer nodes for increased performance. Excellent scalability can be achieved for some problems using hybrid combinations of GPU and CPU resources. This paper presents the acceleration of the open-source Quantum ESPRESSO package with the freely available phiGEMM library. Specifically, the parallel implementation and scaling...

10.1109/pdp.2012.72 article EN 2012-02-01

LBcuda: A high-performance CUDA port of LBsoft for simulation of colloidal systems

Fabio Bonaccorso, Marco Lauricella, Andrea Montessori, Giorgio Amati, Massimo Bernaschi and 3 more

10.1016/j.cpc.2022.108380 article EN Computer Physics Communications 2022-04-27

Design of an Energy Aware Petaflops Class High Performance Cluster Based on Power Architecture

Wissam Abu Ahmad, Andrea Bartolini, Francesco Beneventi, Luca Benini, Andrea Borghesi and 7 more

In this paper we present D.A.V.I.D.E. (Development for an Added Value Infrastructure Designed in Europe), an innovative and energy efficient High Performance Computing cluster designed by E4 Computer Engineering for PRACE (Partnership for Advanced Computing in Europe). D.A.V.I.D.E. is built using best-in-class components (IBM's POWER8-NVLink CPUs, NVIDIA TESLA P100 GPUs, Mellanox InfiniBand EDR 100 Gb/s networking) plus custom hardware and system middleware software. D.A.V.I.D.E. features (i) a dedicated power monitor interface, built around the...

10.1109/ipdpsw.2017.22 preprint EN 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 2017-05-01

SOD2D: A GPU-enabled Spectral Finite Elements Method for compressible scale-resolving simulations

L. Gasparino, Filippo Spiga, O. Lehmkuhl

As new supercomputer architectures become more heavily focused on using hardware accelerators, in particular general-purpose graphical processors, it is therefore relevant that algorithms for computational fluid dynamics, especially those targeting scale-resolving simulations, be designed in such a way as to make efficient use of such hardware. In this paper, we propose one such hardware-accelerated Continuous Galerkin Finite Elements model, aimed at handling scale-resolving simulations of turbulent compressible flows over...

10.1016/j.cpc.2023.109067 article EN cc-by Computer Physics Communications 2023-12-21

CPU+GPU Programming of Stencil Computations for Resource-Efficient Use of GPU Clusters

Mohammed Sourouri, Johannes Langguth, Filippo Spiga, Scott B. Baden, Xing Cai

On modern GPU clusters, the role of the CPUs is often restricted to controlling the GPUs and handling MPI communication. The unused computing power of the CPUs, however, can be considerable for computations whose performance is bounded by memory traffic. This paper investigates the challenges of simultaneous usage of CPUs and GPUs for computation. Our emphasis is on deriving a heterogeneous CPU+GPU programming approach that combines MPI, OpenMP and CUDA. To effectively hide the overhead of various inter- and intra-node communications, a new level of task...

10.1109/cse.2015.33 article EN 2015-10-01

Neural network fusion: a novel CT-MR aortic aneurysm image segmentation method

Duo Wang, Rui Zhang, Zhongzhao Teng, Yuan Huang, Filippo Spiga and 5 more

Medical imaging examination on patients usually involves more than one imaging modality, such as Computed Tomography (CT), Magnetic Resonance (MR) and Positron Emission Tomography (PET) imaging. Multimodal imaging allows examiners to benefit from the advantage of each modality. For example, for Abdominal Aortic Aneurysm, CT imaging shows calcium deposits in the aorta clearly while MR imaging distinguishes thrombus and soft tissues better. Analysing and segmenting both CT and MR images to combine the results will greatly help radiologists and doctors...

10.1117/12.2293371 article EN Medical Imaging 2018: Image Processing 2018-03-02

Application Experiences on a GPU-Accelerated Arm-based HPC Testbed

Wael Elwasif, William F. Godoy, Nick Hagerty, J. Austin Harris, Óscar Hernández and 29 more

This paper assesses and reports the experience of ten teams working to port, validate, and benchmark several High Performance Computing applications on a novel GPU-accelerated Arm testbed system. The testbed consists of eight NVIDIA Arm HPC Developer Kit systems, each one equipped with a server-class Arm CPU from Ampere Computing and two data center GPUs from NVIDIA Corp. The systems are connected together using InfiniBand interconnect. The selected applications and mini-apps are written using several programming languages and use multiple accelerator-based programming models for GPUs such as CUDA,...

10.1145/3581576.3581621 article EN 2023-02-03

Exploring GPU-to-GPU Communication: Insights into Supercomputer Interconnects

Daniele De Sensi, Lorenzo Pichetti, Flavio Vella, Tiziano De Matteis, Zebin Ren and 9 more

10.1109/sc41406.2024.00039 article EN 2024-11-17

The 2DECOMP&FFT library: an update with new CPU/GPU capabilities

Stefano Rolfo, Cédric Flageul, Paul Bartholomew, Filippo Spiga, Sylvain Laizet

The 2DECOMP&FFT library is a software framework written in modern Fortran to build large-scale parallel applications. It is designed for applications using three-dimensional structured meshes with a particular focus on spatially implicit numerical algorithms. However, the library can be easily used with other discretisation schemes based on a structured layout and where pencil decomposition can apply. It is based on a general-purpose 2D pencil decomposition for data distribution and data Input Output (I/O). A 1D slab decomposition is also available as a special case of the 2D pencil decomposition. The library includes...

10.21105/joss.05813 article EN cc-by The Journal of Open Source Software 2023-11-21

CMS Distributed Computing Integration in the LHC sustained operations era

C. Grandi, Brian Bockelman, D. Bonacorsi, I. Fisk, I. González Caballero and 10 more

After many years of preparation the CMS computing system has reached a situation where stability in operations limits the possibility to introduce innovative features. Nevertheless it is the same need of stability and smooth operations that requires the introduction of features that were considered not strategic in the previous phases. Examples are: adequate authorization to control and prioritize the access to storage and computing resources; improved monitoring to investigate problems and identify bottlenecks on the infrastructure; increased automation to reduce the manpower needed for...

10.1088/1742-6596/331/6/062032 article EN Journal of Physics Conference Series 2011-12-23

Open-Source Shared Memory implementation of the HPCG benchmark: analysis, improvements and evaluation on Cavium ThunderX2

Daniel Ruiz, Filippo Spiga, Marc Casas, Marta García-Gasulla, Filippo Mantovani

The High Performance Conjugate Gradient (HPCG) benchmark complements the LINPACK benchmark in the performance evaluation coverage of large High Performance Computing (HPC) systems. Due to its lower arithmetic intensity and higher memory pressure, HPCG is recognized as a more representative benchmark for data-center and irregular memory access pattern workloads, therefore its popularity has been steadily rising within the HPC community. As only a small fraction of the reference version of the HPCG benchmark is parallelized with shared memory techniques (OpenMP), in this paper we introduce...

10.1109/hpcs48598.2019.9188103 article EN 2019-07-01

Effects of Rayleigh and Weber numbers on two-layer turbulent Rayleigh–Bénard convection

Andreas D. Demou, Nicolò Scapin, Marco Crialesi-Esposito, Pedro Costa, Filippo Spiga and 1 more

This study presents direct numerical simulation results of two-layer Rayleigh–Bénard convection, investigating the previously unexplored Rayleigh–Weber parameter space $10^6\leq Ra\leq 10^8$ and $10^2\leq We\leq 10^3$. Global properties, such as the Nusselt and Reynolds numbers, are compared against the extended Grossmann–Lohse theory for two fluid layers, confirming a weak Weber number dependence for all global quantities and considerably larger Reynolds numbers in the lighter fluid. Statistics of the flow reveal that the interface...

10.1017/jfm.2024.805 article EN Journal of Fluid Mechanics 2024-10-02

Exploring GPU-to-GPU Communication: Insights into Supercomputer Interconnects

Daniele De Sensi, Lorenzo Pichetti, Flavio Vella, Tiziano De Matteis, Zebin Ren and 9 more

Multi-GPU nodes are increasingly common in the rapidly evolving landscape of exascale supercomputers. On these systems, GPUs on the same node are connected through dedicated networks, with bandwidths up to a few terabits per second. However, gauging performance expectations and maximizing system efficiency is challenging due to different technologies, design options, and software layers. This paper comprehensively characterizes three supercomputers - Alps, Leonardo, and LUMI - each with a unique architecture and design. We...

10.1109/sc41406.2024.00039 preprint EN arXiv (Cornell University) 2024-08-26

GPU-Accelerated Quantum ESPRESSO

Massimiliano Fatica, Josh Romero, Everett Phillips, Filippo Spiga

This is the first production version of QE-GPU; not all packages and functionalities are fully supported. At present, only PWscf k-point calculations are GPU accelerated. This version is based on the QE v6.1 codebase. Please refer to the README.md for additional information. If you find a specific reproducible problem, please open a GitHub issue. Just a reminder that issues are not feature requests. If you have feature requests or input cases to test, please email Filippo Spiga directly.

10.5281/zenodo.1041825 article EN 2017-11-03