Filippo Spiga

ORCID: 0000-0003-1448-5304
Research Areas
  • Parallel Computing and Optimization Techniques
  • Advanced Data Storage Technologies
  • Distributed and Parallel Computing Systems
  • Scientific Computing and Data Management
  • Fluid Dynamics and Turbulent Flows
  • Lattice Boltzmann Simulation Studies
  • Fluid Dynamics Simulations and Interactions
  • Fluid Dynamics and Heat Transfer
  • Spacecraft and Cryogenic Technologies
  • Interconnection Networks and Systems
  • Computational Fluid Dynamics and Aerodynamics
  • Embedded Systems Design Techniques
  • Computational Physics and Python Applications
  • Matrix Theory and Algorithms
  • Gamma-ray bursts and supernovae
  • Particle Detector Development and Performance
  • Advanced Numerical Methods in Computational Mathematics
  • Nanopore and Nanochannel Transport Studies
  • Digital Transformation in Industry
  • Cell Image Analysis Techniques
  • Cardiac, Anesthesia and Surgical Outcomes
  • Industrial Vision Systems and Defect Detection
  • Aortic aneurysm repair treatments
  • Combustion and flame dynamics
  • Semiconductor materials and devices

Nvidia (United Kingdom)
2022-2025

Nvidia (United States)
2023

ARM (United Kingdom)
2018-2019

University of Cambridge
2015-2018

University of Turin
2018

Irish Centre for High-End Computing
2011-2012

University of Milano-Bicocca
2011

We review the status of the Quantum ESPRESSO software suite for electronic-structure calculations based on plane waves, pseudopotentials, and density-functional theory. We highlight the recent developments in the porting to GPUs of the main codes, using an approach based on OpenACC and CUDA Fortran offloading. We describe, in particular, the results achieved on linear-response codes, which are one of the distinctive features of the suite. We also present extensive performance benchmarks on different GPU-accelerated architectures for the main codes of the suite.

10.1021/acs.jctc.3c00249 article EN cc-by Journal of Chemical Theory and Computation 2023-07-31

We present the Fluid Transport Accelerated Solver, FluTAS, a scalable GPU code for multiphase flows with thermal effects. The code solves the incompressible Navier-Stokes equation for two-fluid systems, with a direct FFT-based Poisson solver for the pressure equation. The interface between the two fluids is represented with the Volume of Fluid (VoF) method, which is mass conserving and well suited for complex flows thanks to its capacity for handling topological changes. The energy equation is explicitly solved and coupled with the momentum equation through the Boussinesq approximation. The code is conceived in...

10.1016/j.cpc.2022.108602 article EN cc-by Computer Physics Communications 2022-11-24

In April 2018, under the auspices of the POR-FESR 2014-2020 program of the Italian Piedmont Region, Turin's Centre on High-Performance Computing for Artificial Intelligence (HPC4AI) was funded with a capital investment of 4.5M€ and began its deployment. HPC4AI aims to facilitate scientific research and engineering in the areas of Artificial Intelligence and Big Data Analytics. HPC4AI will specifically focus on methods for the on-demand provisioning of AI and BDA Cloud services to the regional and national industrial community, which includes the large regional ecosystem of Small-Medium...

10.1145/3203217.3205340 article EN 2018-05-08

Arm-based systems in HPC have been a reality for more than a decade. However, when a new chip enters the market it always implies challenges, not only at the ISA level, but also with regard to the SoC integration, the memory subsystem, the board and node interconnection, and finally the OS and all layers of the software (compiler and libraries). Guided by the procurement of an NVIDIA Grace cluster within the deployment of MareNostrum 5, and emulating the approach of a scientist who needs to migrate their scientific research to a new system, we evaluated five complex...

10.1145/3636480.3637284 article EN 2024-01-08

We discuss the implementation strategy, numerical accuracy, and computational performance of the acceleration of linear algebra operations through graphics processing units (GPUs) for the self-consistent field driver of the Crystal electronic structure package for solid state density functional theory simulations. Accelerated tasks include matrix multiplication, diagonalization, and inversion, as well as Cholesky decomposition. The scaling of the implemented strategy over multiple accelerating devices is assessed in the range of 1–8...

10.1063/5.0250793 article EN The Journal of Chemical Physics 2025-02-25

Speed and efficiency of codes for atomistic simulations can be improved through refactoring and tailoring for GPU architectures. This activity, however, comes with associated, and often overlooked, costs, namely a reduced readability and flexibility of the code upon optimization and a non-negligible development time. The first element becomes particularly cogent when the person who carries out the code porting task is not the creator of the algorithm. In this manuscript we investigate these issues by developing and comparing CUDA (Compute Unified...

10.1177/10943420251331673 article EN The International Journal of High Performance Computing Applications 2025-04-18

GPU computing has revolutionized HPC by bringing the performance of the supercomputer to the desktop. Attractive price, performance, and power characteristics allow multiple GPUs to be plugged into both desktop machines as well as supercomputer nodes for increased performance. Excellent performance and scalability can be achieved for some problems using hybrid combinations of multiple GPUs and CPU computing resources. This paper presents the acceleration of the open-source Quantum ESPRESSO package with the freely available phiGEMM library. Specifically, parallel implementation scaling...

10.1109/pdp.2012.72 article EN 2012-02-01

In this paper we present D.A.V.I.D.E. (Development for an Added Value Infrastructure Designed in Europe), an innovative and energy efficient High Performance Computing cluster designed by E4 Computer Engineering for PRACE (Partnership for Advanced Computing in Europe). D.A.V.I.D.E. is built using best-in-class components (IBM's POWER8-NVLink CPUs, NVIDIA TESLA P100 GPUs, Mellanox InfiniBand EDR 100 Gb/s networking) plus custom hardware and an innovative system middleware software. D.A.V.I.D.E. features (i) a dedicated power monitor interface, built around the...

10.1109/ipdpsw.2017.22 preprint EN 2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 2017-05-01

As new supercomputer architectures become more heavily focused on using hardware accelerators, in particular general-purpose graphical processors, it is relevant that algorithms for computational fluid dynamics, especially those targeting scale-resolving simulations, be designed in such a way as to make efficient use of such hardware. In this paper, we propose one such hardware-accelerated Continuous Galerkin Finite Elements model, aimed at handling scale-resolving simulations of turbulent compressible flows over...

10.1016/j.cpc.2023.109067 article EN cc-by Computer Physics Communications 2023-12-21

On modern GPU clusters, the role of the CPUs is often restricted to controlling the GPUs and handling MPI communication. The unused computing power of the CPUs, however, can be considerable for computations whose performance is bounded by memory traffic. This paper investigates the challenges of simultaneous usage of CPUs and GPUs for computation. Our emphasis is on deriving a heterogeneous CPU+GPU programming approach that combines MPI, OpenMP and CUDA. To effectively hide the overhead of various inter- and intra-node communications, a new level of task...

10.1109/cse.2015.33 article EN 2015-10-01

Medical imaging examination of patients usually involves more than one imaging modality, such as Computed Tomography (CT), Magnetic Resonance (MR) and Positron Emission Tomography (PET) imaging. Multimodal imaging allows examiners to benefit from the advantages of each modality. For example, for Abdominal Aortic Aneurysm, CT imaging shows calcium deposits in the aorta clearly while MR imaging distinguishes thrombus and soft tissues better. Analysing and segmenting both CT and MR images to combine the results will greatly help radiologists and doctors...

10.1117/12.2293371 article EN Medical Imaging 2018: Image Processing 2018-03-02

This paper assesses and reports the experience of ten teams working to port, validate, and benchmark several High Performance Computing applications on a novel GPU-accelerated Arm testbed system. The testbed consists of eight NVIDIA Arm HPC Developer Kit systems, each one equipped with a server-class Arm CPU from Ampere and two data center GPUs from NVIDIA Corp. The systems are connected together using an InfiniBand interconnect. The selected applications and mini-apps are written using several programming languages and use multiple accelerator-based programming models for GPUs such as CUDA,...

10.1145/3581576.3581621 article EN 2023-02-03

The 2DECOMP&FFT library is a software framework written in modern Fortran to build large-scale parallel applications. It is designed for applications using three-dimensional structured meshes with a particular focus on spatially implicit numerical algorithms. However, the library can be easily used with other discretisation schemes based on a structured layout and where pencil decomposition can apply. It is based on a general-purpose 2D pencil decomposition for data distribution and data Input Output (I/O). A 1D slab decomposition is also available as a special case of the 2D pencil decomposition. The library includes...

10.21105/joss.05813 article EN cc-by The Journal of Open Source Software 2023-11-21

After many years of preparation the CMS computing system has reached a situation where stability in operations limits the possibility to introduce innovative features. Nevertheless it is the same need for stability and smooth operations that requires the introduction of features that were considered not strategic in the previous phases. Examples are: adequate authorization to control and prioritize the access to storage and computing resources; improved monitoring to investigate problems and identify bottlenecks on the infrastructure; increased automation to reduce the manpower needed for...

10.1088/1742-6596/331/6/062032 article EN Journal of Physics Conference Series 2011-12-23

The High Performance Conjugate Gradient (HPCG) benchmark complements the LINPACK benchmark in the performance evaluation coverage of large High Performance Computing (HPC) systems. Due to its lower arithmetic intensity and higher memory pressure, HPCG is recognized as a more representative benchmark for data-center and irregular memory access pattern workloads, and therefore its popularity has been steadily rising within the HPC community. As only a small fraction of the reference version of the HPCG benchmark is parallelized with shared memory techniques (OpenMP), in this paper we introduce...

10.1109/hpcs48598.2019.9188103 article EN 2019-07-01

This study presents direct numerical simulation results of two-layer Rayleigh–Bénard convection, investigating the previously unexplored Rayleigh–Weber parameter space $10^6\leq Ra\leq 10^8$ and $10^2\leq We\leq 10^3$. Global properties, such as the Nusselt and Reynolds numbers, are compared against the extended Grossmann–Lohse theory for two fluid layers, confirming a weak Weber number dependence for all global quantities and considerably larger Reynolds numbers in the lighter fluid. Statistics of the flow reveal that the interface...

10.1017/jfm.2024.805 article EN Journal of Fluid Mechanics 2024-10-02

Multi-GPU nodes are increasingly common in the rapidly evolving landscape of exascale supercomputers. On these systems, GPUs on the same node are connected through dedicated networks, with bandwidths up to a few terabits per second. However, gauging performance expectations and maximizing system efficiency is challenging due to different technologies, design options, and software layers. This paper comprehensively characterizes three supercomputers - Alps, Leonardo, and LUMI - each with a unique architecture and design. We...

10.1109/sc41406.2024.00039 preprint EN arXiv (Cornell University) 2024-08-26

This is the first production version of QE-GPU; not all packages and functionalities are fully supported. At present, only PWscf and k-point calculations are GPU accelerated. This version is based on the QE v6.1 codebase. Please refer to README.md for additional information. If you find a specific reproducible problem please open a GitHub issue. Just a reminder that issues are not feature requests. If you have feature requests or input cases to test, email Filippo Spiga directly.

10.5281/zenodo.1041825 article EN 2017-11-03