NFDI4DS | UHH-SEMS - Publication Details

Salvatore Di Girolamo

ORCID: 0000-0003-2197-8860

About

Contact & Profiles

A5019905167

Research Areas

Interconnection Networks and Systems
Software-Defined Networks and 5G
Parallel Computing and Optimization Techniques
Advanced Data Storage Technologies
Cloud Computing and Resource Management
Photonic and Optical Devices
Photorefractive and Nonlinear Optics
Advanced Optical Network Technologies
Advanced Fiber Laser Technologies
Caching and Content Delivery
Advanced Fiber Optic Sensors
Advanced Memory and Neural Computing
Distributed and Parallel Computing Systems
Complex Network Analysis Techniques
Semiconductor Lasers and Optical Devices
Software System Performance and Reliability
Graph Theory and Algorithms
Network Packet Processing and Optimization
Peer-to-Peer Network Technologies
Network Traffic and Congestion Control
Nuclear Physics and Applications
IoT and Edge/Fog Computing
Scientific Computing and Data Management
SARS-CoV-2 and COVID-19 Research
Advanced Graph Neural Networks

ETH Zurich
2015-2024

Tamedia (Switzerland)
2024

Zürcher Fachhochschule
2022

Technical University of Darmstadt
2022

University of Illinois Urbana-Champaign
2022

Indian Institute of Technology Kanpur
2022

Università della Svizzera italiana
2021

Board of the Swiss Federal Institutes of Technology
2017

University of Pisa
2015-2016

University of Eastern Finland
2006-2013

Kilometer-Scale Climate Models: Prospects and Challenges

OPENALEX - Publications

Christoph Schär, Oliver Fuhrer, Andrea Arteaga, Nikolina Ban, Christophe Charpilloz, and 13 more

Currently major efforts are underway toward refining the horizontal resolution (or grid spacing) of climate models to about 1 km, using both global and regional climate models (GCMs and RCMs). Several groups have succeeded in conducting kilometer-scale multiweek GCM simulations and decadelong continental-scale RCM simulations. There is well-founded hope that this increase in resolution represents a quantum jump in climate modeling, as it enables replacing the parameterization of moist convection by an explicit treatment. It is expected that this will improve...

10.1175/bams-d-18-0167.1 article EN Bulletin of the American Meteorological Society 2019-10-25

Bispecific IgG neutralizes SARS-CoV-2 variants and prevents escape in mice

Raoul De Gasparo, Mattia Pedotti, Luca Simonelli, Petr Nickl, Frauke Muecksch, and 42 more

Neutralizing antibodies that target the receptor-binding domain (RBD) of the SARS-CoV-2 spike protein are among the most promising approaches against COVID-19 [1,2]. A bispecific IgG1-like molecule (CoV-X2) has been developed on the basis of C121 and C135, two antibodies derived from donors who had recovered from COVID-19 [3]. Here we show that CoV-X2 simultaneously binds two independent sites on the RBD and, unlike its parental antibodies, prevents detectable binding to the cellular receptor of the virus, angiotensin-converting enzyme 2 (ACE2)....

10.1038/s41586-021-03461-y article EN other-oa Nature 2021-03-25

An In-Depth Analysis of the Slingshot Interconnect

Daniele De Sensi, Salvatore Di Girolamo, Kim H. McMahon, Duncan Roweth, Torsten Hoefler

The interconnect is one of the most critical components in large-scale computing systems, and its impact on the performance of applications is going to increase with system size. In this paper, we describe SLINGSHOT, an interconnection network for large-scale computing systems. SLINGSHOT is based on high-radix switches, which allow building exascale and hyper-scale datacenter networks with at most three switch-to-switch hops. Moreover, SLINGSHOT provides efficient adaptive routing and congestion control algorithms, and highly tunable traffic classes....

10.1109/sc41405.2020.00039 preprint EN 2020-11-01

SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems

Maciej Besta, Raghavendra Kanakagiri, Grzegorz Kwaśniewski, Rachata Ausavarungnirun, Jakub Beránek, and 14 more

Simple graph algorithms such as PageRank have been the target of numerous hardware accelerators. Yet, there also exist much more complex graph mining algorithms for problems such as clustering or maximal clique listing. These algorithms are memory-bound and thus could be accelerated by hardware techniques such as Processing-in-Memory (PIM). However, they also come with non-straightforward parallelism and complicated memory access patterns. In this work, we address this problem with a simple yet surprisingly powerful observation: operations on sets of vertices,...

10.1145/3466752.3480133 article EN 2021-10-17

sPIN

Torsten Hoefler, Salvatore Di Girolamo, Konstantin Taranov, Ryan E. Grant, Ron Brightwell

Optimizing communication performance is imperative for large-scale computing because communication overheads limit the strong scalability of parallel applications. Today's network cards contain rather powerful processors optimized for data movement. However, these devices are limited to fixed functions, such as remote direct memory access. We develop sPIN, a portable programming model to offload simple packet processing functions to the network card. To demonstrate the potential of the model, we design a cycle-accurate simulation...

10.1145/3126908.3126970 article EN 2017-11-08

High-Performance Routing With Multipathing and Path Diversity in Ethernet and HPC Networks

Maciej Besta, Jens Domke, Marcel Schneider, Marek Konieczny, Salvatore Di Girolamo, and 3 more

The recent line of research into topology design focuses on lowering network diameter. Many low-diameter topologies such as Slim Fly or Jellyfish that substantially reduce cost, power consumption, and latency have been proposed. A key challenge in realizing the benefits of these topologies is routing. On one hand, these networks provide shorter path lengths than established topologies such as Clos or torus, leading to performance improvements. On the other hand, the number of shortest paths between each pair of endpoints is much smaller than in Clos, but there is a...

10.1109/tpds.2020.3035761 article EN IEEE Transactions on Parallel and Distributed Systems 2020-11-04

Taming unbalanced training workloads in deep learning with partial collective operations

Shigang Li, Tal Ben‐Nun, Salvatore Di Girolamo, Dan Alistarh, Torsten Hoefler

Load imbalance pervasively exists in distributed deep learning training systems, caused either by the inherent imbalance in learned tasks or by the system itself. Traditional synchronous Stochastic Gradient Descent (SGD) achieves good accuracy for a wide variety of tasks, but relies on global synchronization to accumulate the gradients at every training step. In this paper, we propose eager-SGD, which relaxes the global synchronization for decentralized accumulation. To implement eager-SGD, we propose to use two partial collectives: solo and majority. With solo allreduce, the faster...

10.1145/3332466.3374528 preprint EN 2020-02-19

Flare

Daniele De Sensi, Salvatore Di Girolamo, Saleh Ashkboos, Shigang Li, Torsten Hoefler

The allreduce operation is one of the most commonly used communication routines in distributed applications. To improve its bandwidth and to reduce network traffic, this operation can be accelerated by offloading it to network switches, which aggregate the data received from the hosts and send them back the aggregated result. However, existing solutions provide limited customization opportunities and might provide suboptimal performance when dealing with custom operators and data types, with sparse data, or when reproducibility of the aggregation is a concern. To deal...

10.1145/3458817.3476178 preprint EN 2021-10-21

Mitigating network noise on Dragonfly networks through application-aware routing

Daniele De Sensi, Salvatore Di Girolamo, Torsten Hoefler

System noise can negatively impact the performance of HPC systems, and the interconnection network is one of the main factors contributing to this problem. To mitigate this effect, adaptive routing sends packets on non-minimal paths if they are less congested. However, while this may mitigate interference caused by congestion, it also generates more traffic since packets traverse additional hops, causing in turn congestion on other applications and on the application itself. In this paper, we first describe how to estimate network noise. By following these...

10.1145/3295500.3356196 preprint EN 2019-11-07

A RISC-V in-network accelerator for flexible high-performance low-power packet processing

Salvatore Di Girolamo, Andreas Kurth, Alexandru Calotoiu, Thomas Benz, Timo Schneider, and 3 more

The capacity of offloading data and control tasks to the network is becoming increasingly important, especially if we consider the faster growth of network speed when compared to CPU frequencies. In-network compute alleviates the host CPU load by running tasks directly in the network, enabling additional computation/communication overlap and potentially improving overall application performance. However, sustaining bandwidths provided by next-generation networks, e.g., 400 Gbit/s, can become a challenge. sPIN is a programming model for...

10.1109/isca52012.2021.00079 article EN 2021-06-01

Fast adaptive interferometer on dynamic reflection hologram in CdTe:V

Salvatore Di Girolamo, Alexei A. Kamshilin, Roman V. Romashko, Yuriy N. Kulchin, J.C. Launay

We present an adaptive interferometer based on the reflection dynamic hologram recorded in a photorefractive CdTe:V crystal with no external electric field. Linear phase-to-intensity transformation is achieved by vectorial mixing of two waves with different polarization states (linear and elliptical) in the anisotropic diffraction geometry. A comparison of the reflection and transmission geometries, considering both sensitivity and adaptability, is carried out. It is shown that the reflection geometry is characterized by a better combination of these parameters...

10.1364/oe.15.000545 article EN cc-by Optics Express 2007-01-22

FatPaths: Routing in Supercomputers and Data Centers when Shortest Paths Fall Short

Maciej Besta, Marcel Schneider, Marek Konieczny, Karolina Cynk, Erik Henriksson, and 3 more

We introduce FatPaths: a simple, generic, and robust routing architecture that enables state-of-the-art low-diameter topologies such as Slim Fly to achieve unprecedented performance. FatPaths targets Ethernet stacks in both HPC supercomputers as well as cloud data centers and clusters. FatPaths exposes and exploits the rich ("fat") diversity of both minimal and non-minimal paths for high-performance multi-pathing. Moreover, FatPaths uses a redesigned "purified" transport layer that removes virtually all TCP performance issues (e.g., the slow...

10.1109/sc41405.2020.00031 article EN 2020-11-01

CoRM

Konstantin Taranov, Salvatore Di Girolamo, Torsten Hoefler

Distributed memory systems are becoming increasingly important since they provide a system-scale abstraction where physically separated memories can be addressed as a single logical one. This abstraction enables memory disaggregation, allowing systems such as in-memory databases, caching services, and ephemeral storage to be naturally deployed at large scales. While this effectively increases the memory capacity of these systems, it faces additional overheads for remote memory accesses. To narrow the difference between local and remote accesses, low latency...

10.1145/3448016.3452817 article EN Proceedings of the 2021 International Conference on Management of Data 2021-06-09

Orthogonal geometry of wave interaction in a photorefractive crystal for linear phase demodulation

Salvatore Di Girolamo, Roman V. Romashko, Yuri N. Kulchin, Alexei A. Kamshilin

10.1016/j.optcom.2009.09.035 article EN Optics Communications 2009-10-09

Exploiting Offload Enabled Network Interfaces

Salvatore Di Girolamo, Pierre Jolivet, Keith D. Underwood, Torsten Hoefler

Network interface cards are one of the key components to achieve efficient parallel performance. In the past, they have gained new functionalities such as lossless transmission and remote direct memory access that are now ubiquitous in high-performance systems. Prototypes of next-generation network cards now offer new features that facilitate device programming. In this work, various possible uses of network offload features are explored. We use the Portals 4 specification as an example to demonstrate techniques such as fully asynchronous, multi-schedule and solo...

10.1109/hoti.2015.21 article EN 2015-08-01

Photorefractive vectorial wave mixing in different geometries

Roman V. Romashko, Salvatore Di Girolamo, Yuri N. Kulchin, Alexei A. Kamshilin

We analyze vectorial wave mixing in a photorefractive crystal of cubic symmetry for different geometries of beam interaction: reflection, transmission, and orthogonal. It is shown that the orthogonal geometry, in contrast with the others, supports efficient phase demodulation of a depolarized object wave in the linear mode without using any polarization-filtering elements. As a result, adaptive interferometers based on the orthogonal geometry can provide a higher signal-to-noise ratio due to lower noise and optical losses.

10.1364/josab.27.000311 article EN Journal of the Optical Society of America B 2010-01-22

Continuous skyline queries on multicore architectures

Tiziano De Matteis, Salvatore Di Girolamo, Gabriele Mencagli

The emergence of real-time decision-making applications in domains like high-frequency trading, emergency management, and service level analysis in communication networks has led to the definition of new classes of queries. Skyline queries are a notable example. Their results consist of all the tuples whose attribute vector is not dominated (in the Pareto sense) by that of any other tuple. Because of their popularity, skyline queries have been studied in terms of both sequential algorithms and parallel implementations for...

10.1002/cpe.3866 article EN Concurrency and Computation Practice and Experience 2016-05-19

Noise in the Clouds

Daniele De Sensi, Tiziano De Matteis, Konstantin Taranov, Salvatore Di Girolamo, Tobias Rahn, and 1 more

Cloud computing represents an appealing opportunity for cost-effective deployment of HPC workloads on the best-fitting hardware. However, although cloud and on-premise HPC systems offer similar computational resources, their network architecture and performance may differ significantly. For example, these systems use fundamentally different network transport and routing protocols, which may introduce network noise that can eventually limit application scaling. This work analyzes the network performance, scalability, and cost of running HPC workloads on cloud systems....

10.1145/3570609 article EN Proceedings of the ACM on Measurement and Analysis of Computing Systems 2022-12-01

Exploring GPU-to-GPU Communication: Insights into Supercomputer Interconnects

Daniele De Sensi, Lorenzo Pichetti, Flavio Vella, Tiziano De Matteis, Zebin Ren, and 9 more

10.1109/sc41406.2024.00039 article EN 2024-11-17

Sensing of multimode-fiber strain by a dynamic photorefractive hologram

Salvatore Di Girolamo, Alexei A. Kamshilin, Roman V. Romashko, Yuriy N. Kulchin, Jean C. Launay

We present a strain sensor in which a multimode fiber is used as the sensitive element. High sensitivity to dynamic strains is achieved by means of vectorial wave mixing in a photorefractive CdTe:V crystal. It was found that the largest source of noise in our sensor is related to instability of the polarization state of the speckles emerging from the fiber. This noise is significantly diminished with a fiber whose core has a large diameter (550 µm).

10.1364/ol.32.001821 article EN Optics Letters 2007-06-20

Network-accelerated non-contiguous memory transfers

Salvatore Di Girolamo, Konstantin Taranov, Andreas Kurth, Michael Schaffner, Timo Schneider, and 5 more

Applications often communicate data that is non-contiguous in the send- or the receive-buffer, e.g., when exchanging a column of a matrix stored in row-major order. While non-contiguous transfers are well supported in HPC (e.g., MPI derived datatypes), they can still be up to 5x slower than contiguous transfers of the same size. As we enter the era of network acceleration, we need to investigate which tasks to offload to the NIC: in this work we argue that non-contiguous memory transfers can be transparently network-accelerated, truly achieving zero-copy communications. We implement and extend...

10.1145/3295500.3356189 preprint EN 2019-11-07

HammingMesh: A Network Topology for Large-Scale Deep Learning

Torsten Hoefler, Tommaso Bonato, Daniele De Sensi, Salvatore Di Girolamo, Shigang Li, and 5 more

Numerous microarchitectural optimizations unlocked tremendous processing power for deep neural networks that in turn fueled the AI revolution. With the exhaustion of such optimizations, the growth of modern AI is now gated by the performance of training systems, especially their data movement. Instead of focusing on single accelerators, we investigate data-movement characteristics of large-scale training at full system scale. Based on our workload analysis, we design HammingMesh, a novel network topology that provides high bandwidth at low...

10.1109/sc41404.2022.00016 article EN 2022-11-01

sPIN: High-performance streaming Processing in the Network

Torsten Hoefler, Salvatore Di Girolamo, Konstantin Taranov, Ryan E. Grant, Ronald B. Brightwell

10.48550/arxiv.1709.05483 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Exploiting Offload-Enabled Network Interfaces

Salvatore Di Girolamo, Pierre Jolivet, Keith D. Underwood, Torsten Hoefler

Network interface cards are one of the key components to achieve efficient parallel performance. In the past, they have gained new functionalities, such as lossless transmission and remote direct memory access, that are now ubiquitous in high-performance systems. Prototypes of next-generation network cards now offer new features that facilitate device programming. In this article, the authors discuss an abstract machine model for offloading architectures. They used Portals 4 to implement the proposed abstraction model, and present two...

10.1109/mm.2016.56 article EN IEEE Micro 2016-07-01