NFDI4DS | UHH-SEMS - Publication Details

Shay Vargaftik

ORCID: 0000-0002-0982-7894


About

Contact & Profiles

A5078049483

Research Areas

Cloud Computing and Resource Management
Network Security and Intrusion Detection
Distributed and Parallel Computing Systems
Privacy-Preserving Technologies in Data
Software-Defined Networks and 5G
Internet Traffic Analysis and Secure E-voting
Network Traffic and Congestion Control
Stochastic Gradient Optimization Techniques
Sparse and Compressive Sensing Techniques
Advanced Queuing Theory Analysis
Distributed Sensor Networks and Detection Algorithms
Anomaly Detection Techniques and Applications
Software System Performance and Reliability
Advanced Optical Network Technologies
Advanced Wireless Network Optimization
Distributed systems and fault tolerance
Network Packet Processing and Optimization
Data Stream Mining Techniques
Cryptography and Data Security
Parallel Computing and Optimization Techniques
Age of Information Optimization
Interconnection Networks and Systems
Advanced Malware Detection Techniques
Advanced MIMO Systems Optimization
IoT and Edge/Fog Computing

Broadcom (Israel)
2023-2024

Technion – Israel Institute of Technology
2016-2022

Herzliya Medical Center
2020-2022

Kitware (United States)
2021

IBM Research - Haifa
2016-2017

dRMT

OPENALEX - Publications

Sharad Chole Andy Fingerhut Sha Ma Anirudh Sivaraman Shay Vargaftik and 7 more

We present dRMT (disaggregated Reconfigurable Match-Action Table), a new architecture for programmable switches. dRMT overcomes two important restrictions of RMT, the predominant pipeline-based architecture for programmable switches: (1) table memory is local to an RMT pipeline stage, implying that memory not used by one stage cannot be reclaimed by another, and (2) RMT is hardwired to always sequentially execute matches followed by actions as packets traverse pipeline stages. We show that these restrictions make it difficult to execute programs efficiently on RMT.

10.1145/3098822.3098823 article EN 2017-08-04

In-Network Machine Learning Using Programmable Network Devices: A Survey

OPENALEX - Publications

Changgang Zheng Xinpeng Hong Damu Ding Shay Vargaftik Yaniv Ben-Itzhak and 1 more

Machine learning is widely used to solve networking challenges, ranging from traffic classification and anomaly detection to network configuration. However, machine learning also requires significant processing and often increases the load on both networks and servers. The introduction of in-network computing, enabled by programmable network devices, has made it possible to run applications within the network, providing higher throughput and lower latency. Soon after, in-network machine learning solutions started to emerge, enabling machine learning functionality within the network itself. This survey...

10.1109/comst.2023.3344351 article EN IEEE Communications Surveys & Tutorials 2023-12-19

IIsy: Hybrid In-Network Classification Using Programmable Switches

OPENALEX - Publications

Changgang Zheng Zhaoqi Xiong Thanh Bui-Tien Siim Kaupmees Riyad Bensoussane and 4 more

The soaring use of machine learning leads to increasing processing demands. As data volume keeps growing, providing classification services with good machine learning performance, high throughput, low latency, and minimal equipment overheads becomes a challenge. Offloading machine learning tasks to network switches can be a scalable solution to this problem, with high throughput and low latency. However, network devices are resource constrained and lack support for machine learning functionality. In this paper, we introduce IIsy, a novel tool for mapping machine learning classification models to off-the-shelf switches....

10.1109/tnet.2024.3364757 article EN IEEE/ACM Transactions on Networking 2024-02-16

Virtualized Congestion Control

OPENALEX - Publications

Bryce Cronkite-Ratcliff Aran Bergman Shay Vargaftik Madhusudhan Ravi Nick McKeown and 2 more

New congestion control algorithms are rapidly improving datacenters by reducing latency, overcoming incast, and increasing throughput and fairness. Ideally, the operating system in every server and virtual machine is updated to support the new congestion control algorithms. However, legacy applications often cannot be upgraded to a new operating system version, which means the advances are off-limits to them. Worse, as we show, legacy applications can be squeezed out, which in the worst case prevents the entire network from adopting new algorithms.

10.1145/2934872.2934889 article EN 2016-08-01

Automating In-Network Machine Learning

OPENALEX - Publications

Changgang Zheng Mingyuan Zang Xinpeng Hong Riyad Bensoussane Shay Vargaftik and 2 more

Using programmable network devices to aid in-network machine learning has been the focus of significant research. However, most of the research was of a limited scope, providing a proof of concept or describing a closed-source algorithm. To date, no general solution has been provided for mapping machine learning algorithms to programmable network devices. In this paper, we present Planter, an open-source, modular framework for mapping trained machine learning models to programmable devices. Planter supports a wide range of machine learning models and multiple targets, and can be easily extended. The evaluation compares different...

10.48550/arxiv.2205.08824 preprint EN other-oa arXiv (Cornell University) 2022-01-01

IIsy: Practical In-Network Classification

OPENALEX - Publications

Changgang Zheng Zhaoqi Xiong Thanh T Bui Siim Kaupmees Riyad Bensoussane and 4 more

The rat race between user-generated data and data-processing systems is currently won by data. The increased use of machine learning leads to a further increase in processing requirements, while data volume keeps growing. To win the race, machine learning needs to be applied to the data as it goes through the network. In-network classification of data can reduce the load on servers, reduce response time, and increase scalability. In this paper, we introduce IIsy, implementing machine learning classification models in a hybrid fashion using off-the-shelf network devices. IIsy targets three main challenges...

10.48550/arxiv.2205.08243 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Planter: Rapid Prototyping of In-Network Machine Learning Inference

OPENALEX - Publications

Changgang Zheng Mingyuan Zang Xinpeng Hong L.-P. L. Perreault Riyad Bensoussane and 3 more

In-network machine learning inference provides high throughput and low latency. It is ideally located within the network, is power efficient, and improves applications' performance. Despite its advantages, the bar to in-network machine learning research is high, requiring significant expertise in programmable data planes, in addition to knowledge of machine learning and the application area. Existing solutions are mostly one-time efforts that are hard to reproduce, change, or port across platforms. In this paper, we present Planter: a modular and efficient...

10.1145/3687230.3687232 article EN ACM SIGCOMM Computer Communication Review 2024-01-30

HACK: Homomorphic Acceleration via Compression of the Key-Value Cache for Disaggregated LLM Inference

OPENALEX - Publications

Zeyu Zhang Haiying Shen Shay Vargaftik Ran Ben Basat Michael Mitzenmacher and 1 more

Disaggregated Large Language Model (LLM) inference has gained popularity as it separates the computation-intensive prefill stage from the memory-intensive decode stage, avoiding prefill-decode interference and improving resource utilization. However, transmitting Key-Value (KV) data between the two stages can be a bottleneck, especially for long prompts. Additionally, the computation time overhead is key to optimizing Job Completion Time (JCT), and the KV data size can become prohibitive for long prompts and sequences. Existing...

10.48550/arxiv.2502.03589 preprint EN arXiv (Cornell University) 2025-02-05

Counter Pools: Counter Representation for Efficient Stream Processing

OPENALEX - Publications

Ran Ben Basat Gil Einziger Bilal Tyah Shay Vargaftik

Due to the large data volume and number of distinct elements, space is often the bottleneck of many stream processing systems. The data structures used by these systems often consist of counters whose optimization yields significant memory savings. The challenge lies in balancing the size of the counters: too small, and they overflow; too large, and memory capacity limits their number. In this work, we suggest an efficient encoding scheme that sizes each counter according to its needs. Our approach uses fixed-sized pools (e.g., a single word or 64...

10.48550/arxiv.2502.14699 preprint EN arXiv (Cornell University) 2025-02-20

ScionFL: Efficient and Robust Secure Quantized Aggregation

OPENALEX - Publications

Yaniv Ben-Itzhak Helen Möllering Benny Pinkas Thomas Schneider Ajith Suresh and 5 more

Secure aggregation is commonly used in federated learning (FL) to alleviate privacy concerns related to the central aggregator seeing all parameter updates in the clear. Unfortunately, most existing secure aggregation schemes ignore two critical orthogonal research directions that aim to (i) significantly reduce client-server communication and (ii) mitigate the impact of malicious clients. However, both of these additional properties are essential to facilitate cross-device FL with thousands or even millions of (mobile)...

10.1109/satml59370.2024.00031 article EN 2024-04-09

Memento

OPENALEX - Publications

Ran Ben Basat Gil Einziger Isaac Keslassy Ariel Orda Shay Vargaftik and 1 more

Cloud operators require real-time identification of Heavy Hitters (HH) and Hierarchical Heavy Hitters (HHH) for applications such as load balancing, traffic engineering, and attack mitigation. However, existing techniques are slow in detecting new heavy hitters.

10.1145/3281411.3281427 article EN 2018-11-28

LSQ: Load Balancing in Large-Scale Heterogeneous Systems With Multiple Dispatchers

OPENALEX - Publications

Shay Vargaftik Isaac Keslassy Ariel Orda

Nowadays, the efficiency and even the feasibility of traditional load-balancing policies are challenged by the rapid growth of cloud infrastructure and the increasing levels of server heterogeneity. In such heterogeneous systems with many load balancers, traditional solutions, such as JSQ, incur a prohibitively large communication overhead and detrimental incast effects due to herd behavior. Alternative low-communication policies, such as JSQ(d) and the recently proposed JIQ, are either unstable or provide poor performance. We introduce Local Shortest...

10.1109/tnet.2020.2980061 article EN IEEE/ACM Transactions on Networking 2020-03-31
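
For readers unfamiliar with the baseline policies named in the LSQ abstract above, the following is a minimal Python sketch of JSQ (join the shortest queue) and JSQ(d) (sample d queues, join the shortest of the sample). It is a textbook illustration only, not code from the paper, and it does not implement LSQ itself; the function names and the `queue_lengths` list are hypothetical.

```python
import random


def jsq(queue_lengths):
    """JSQ: inspect every queue and return the index of the shortest one."""
    return min(range(len(queue_lengths)), key=lambda i: queue_lengths[i])


def jsq_d(queue_lengths, d, rng=random):
    """JSQ(d): sample d distinct servers at random, return the shortest of them."""
    sampled = rng.sample(range(len(queue_lengths)), d)
    return min(sampled, key=lambda i: queue_lengths[i])


# Hypothetical snapshot of five server queues.
queues = [4, 0, 7, 2, 5]
full_choice = jsq(queues)            # needs the state of all 5 servers
sampled_choice = jsq_d(queues, d=2)  # needs the state of only 2 servers
```

JSQ needs the state of every server for every dispatching decision, which is the communication overhead the abstract refers to; JSQ(d) trades that overhead for a decision based on partial information.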

SALSA: Self-Adjusting Lean Streaming Analytics

OPENALEX - Publications

Ran Ben Basat Gil Einziger Michael Mitzenmacher Shay Vargaftik

Counters are the fundamental building block of many data sketching schemes, which hash items to a small number of counters and account for collisions to provide good approximations for frequencies and other measures. Most existing methods rely on fixed-size counters, which may be wasteful in terms of space, as counters must be large enough to eliminate any risk of overflow. Instead, some solutions use small, fixed-size counters that may overflow into secondary structures. This paper takes a different approach. We propose a simple and general method called SALSA...

10.1109/icde51399.2021.00080 article EN 2021 IEEE 37th International Conference on Data Engineering (ICDE) 2021-04-01
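
As background for the SALSA abstract above, here is a minimal sketch of a Count-Min sketch with fixed-size counters, the kind of structure the abstract describes as the common baseline. It is not SALSA and is not taken from the paper; the class name, the `counter_bits` parameter, and the choice to saturate on overflow are assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """Count-Min sketch with fixed-size, saturating counters (illustrative)."""

    def __init__(self, width, depth, counter_bits=32):
        self.width = width
        self.depth = depth
        # Assumption: a counter that would overflow saturates at its maximum.
        self.max_value = (1 << counter_bits) - 1
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One independent-looking hash per row, derived by salting with the row id.
        salt = row.to_bytes(8, "little")
        digest = hashlib.blake2b(item.encode(), digest_size=8, salt=salt).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            i = self._index(item, row)
            self.rows[row][i] = min(self.rows[row][i] + count, self.max_value)

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum is the best estimate.
        return min(self.rows[r][self._index(item, r)] for r in range(self.depth))


sketch = CountMinSketch(width=64, depth=4)
for _ in range(5):
    sketch.add("flow-a")
sketch.add("flow-b", 2)
```

With `counter_bits` set very low the counters saturate quickly, which shows the trade-off the abstract describes: small counters save space but lose information once they overflow.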

AnchorHash: A Scalable Consistent Hash

OPENALEX - Publications

Gal Mendelson Shay Vargaftik Katherine Barabash Dean H. Lorenz Isaac Keslassy and 1 more

Consistent hashing is a central building block in many networking applications, such as maintaining connection affinity of TCP flows. However, current consistent hashing solutions do not ensure full consistency under arbitrary changes or scale poorly in terms of memory footprint, update time and key lookup complexity. We present AnchorHash, a scalable and fully-consistent hashing algorithm. AnchorHash achieves high key lookup rate, low memory footprint and low update time. We formally establish its strong theoretical guarantees, and present an advanced...

10.1109/tnet.2020.3039547 article EN IEEE/ACM Transactions on Networking 2020-12-10
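
To make the consistency property in the AnchorHash abstract above concrete, the following sketch uses the classic ring-based consistent hash with virtual nodes. This is a different, older technique swapped in for illustration, not the AnchorHash algorithm; the class name, the `vnodes` parameter, and the node labels are hypothetical.

```python
import bisect
import hashlib


def _h(value):
    """Map a string to a 64-bit point on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    """Classic ring-based consistent hash with virtual nodes (not AnchorHash)."""

    def __init__(self, nodes, vnodes=64):
        # Each node owns several points on the ring to smooth the key distribution.
        self._points = sorted((_h(f"{n}#{v}"), n) for n in nodes for v in range(vnodes))
        self._keys = [p for p, _ in self._points]

    def lookup(self, key):
        # A key belongs to the first ring point clockwise from its hash.
        i = bisect.bisect_right(self._keys, _h(key)) % len(self._keys)
        return self._points[i][1]


before = HashRing(["A", "B", "C"])
after = HashRing(["A", "B"])  # node "C" removed
moved = [k for k in map(str, range(1000)) if before.lookup(k) != after.lookup(k)]
```

In this sketch, only keys that were mapped to the removed node change their assignment, which is the behavior that keeps existing connections pinned to their servers. The ring pays for this with a memory footprint proportional to the number of virtual nodes and a logarithmic lookup.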

Characterizing, exploiting, and detecting DMA code injection vulnerabilities in the presence of an IOMMU

OPENALEX - Publications

Alex Markuze Shay Vargaftik Gil Kupfer Boris Pismeny Nadav Amit and 2 more

Direct memory access (DMA) renders a system vulnerable to DMA attacks, in which I/O devices access memory regions not intended for their use. Hardware input-output memory management units (IOMMU) can be used to provide protection. However, an IOMMU cannot prevent all DMA attacks because it only restricts DMA at page-level granularity, leading to sub-page vulnerabilities.

10.1145/3447786.3456249 article EN 2021-04-21
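
The page-granularity point in the abstract above can be illustrated with simple address arithmetic. The sketch below assumes a 4096-byte page and a simplified model in which mapping a buffer exposes every page the buffer touches; the function name and the example addresses are hypothetical and are not taken from the paper.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def exposed_range(buf_addr, buf_len, page_size=PAGE_SIZE):
    """Return the [start, end) byte range exposed when a buffer is mapped
    at page granularity: the buffer is rounded out to whole pages."""
    start = buf_addr - (buf_addr % page_size)
    end = -(-(buf_addr + buf_len) // page_size) * page_size  # round up
    return start, end


# A 256-byte buffer in the middle of a page exposes the whole page.
start, end = exposed_range(0x1010, 256)
extra_bytes = (end - start) - 256
```

Under these assumptions, the device can reach 3840 bytes beyond the 256 it was meant to use, so any unrelated data that shares the page is exposed.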

Faster and More Accurate Measurement through Additive-Error Counters

OPENALEX - Publications

Ran Ben Basat Gil Einziger Michael Mitzenmacher Shay Vargaftik

Counters are a fundamental building block for networking applications such as load balancing, traffic engineering, and intrusion detection, which require estimating flow sizes and identifying heavy hitter flows. Existing works suggest replacing counters with shorter multiplicative error estimators that improve the accuracy by fitting more of them within a given space. However, such estimators impose a computational overhead that degrades the measurement throughput. Instead, we propose additive error estimators, which are simpler, faster,...

10.1109/infocom41043.2020.9155340 article EN IEEE INFOCOM 2020 - IEEE Conference on Computer Communications 2020-07-01

When ML Training Cuts Through Congestion: Just-in-Time Gradient Compression via Packet Trimming

OPENALEX - Publications

Xiaoqi Chen Shay Vargaftik Ran Ben Basat

10.1145/3696348.3696880 article EN 2024-11-11

RADE: resource-efficient supervised anomaly detection using decision tree-based ensemble methods

OPENALEX - Publications

Shay Vargaftik Isaac Keslassy Ariel Orda Yaniv Ben-Itzhak

10.1007/s10994-021-06047-x article EN Machine Learning 2021-09-03

C-share

OPENALEX - Publications

Shay Vargaftik Cosmin Caba Liran Schour Yaniv Ben-Itzhak

Integrating optical circuit switches in data-centers is an on-going research challenge. In recent years, state-of-the-art solutions introduce hybrid packet/circuit architectures for different optical circuit switch technologies, control techniques, and traffic rerouting methods. These solutions are based on separated packet and circuit planes which do not have the ability to utilize an optical circuit with flows that do not arrive from or are not delivered to switches directly connected to the circuit's end-points. Moreover, current SDN-based elephant flow rerouting methods require a...

10.1145/3390251.3390253 article EN ACM SIGCOMM Computer Communication Review 2020-03-23

No Packet Left Behind: Avoiding Starvation in Dynamic Topologies

OPENALEX - Publications

Shay Vargaftik Isaac Keslassy Ariel Orda

Backpressure schemes are known to stabilize stochastic networks through the use of congestion gradients in routing and resource allocation decisions. Nonetheless, these schemes share a significant drawback, namely, the delay guarantees are obtained only in terms of average values. As a result, arbitrary packets may never reach their destination due to both the starvation and the last-packet problems. These problems occur because in backpressure schemes, packet scheduling needs a subsequent stream of packets to produce the required congestion gradient for...

10.1109/tnet.2017.2706366 article EN IEEE/ACM Transactions on Networking 2017-06-09

Composite-Path Switching

OPENALEX - Publications

Shay Vargaftik Katherine Barabash Yaniv Ben-Itzhak Ofer Biran Isaac Keslassy and 2 more

Hybrid switching combines a high-bandwidth optical circuit switch in parallel with a low-bandwidth electronic packet switch. It presents an appealing solution for scaling datacenter architectures. Unfortunately, it does not fit many traffic patterns produced by typical datacenter applications, and in particular the skewed traffic patterns that involve highly intensive one-to-many and many-to-one communications.

10.1145/2999572.2999610 article EN 2016-11-29

Persistent-Idle Load-Distribution

OPENALEX - Publications

Rami Atar Isaac Keslassy Gal Mendelson Ariel Orda Shay Vargaftik

A parallel server system is considered in which a dispatcher routes incoming jobs to a fixed number of heterogeneous servers, each with its own queue. Much effort has been previously made to design policies that use limited state information (e.g., the queue lengths in a small subset of the set of servers, or the identity of the idle servers). However, existing policies either do not achieve the stability region or perform poorly in terms of job completion time. We introduce Persistent-Idle (PI), a new, perhaps counterintuitive, load-distribution policy...

10.1287/stsy.2019.0054 article EN cc-by Stochastic Systems 2020-05-27

Memento: Making Sliding Windows Efficient for Heavy Hitters

OPENALEX - Publications

Ran Ben Basat Gil Einziger Isaac Keslassy Ariel Orda Shay Vargaftik and 1 more

Cloud operators require timely identification of Heavy Hitters (HH) and Hierarchical Heavy Hitters (HHH) for applications such as load balancing, traffic engineering, and attack mitigation. However, existing techniques are slow in detecting new heavy hitters. In this paper, we present the case for identifying heavy hitters through sliding windows. Sliding windows are quicker and more accurate to detect new heavy hitters than current...

10.1109/tnet.2021.3132385 article EN IEEE/ACM Transactions on Networking 2022-01-29
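
The sliding-window idea in the Memento abstract above can be sketched with an exact baseline that remembers the last W items and reports those whose count reaches a fraction theta of the window. This is only an illustration of the problem statement, not the Memento algorithm: it stores the whole window, and the class and parameter names are hypothetical.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Exact heavy-hitter detection over the last `window` items (baseline)."""

    def __init__(self, window):
        self.window = window
        self.items = deque()
        self.counts = Counter()

    def add(self, item):
        self.items.append(item)
        self.counts[item] += 1
        if len(self.items) > self.window:
            # Evict the oldest item so only the last `window` items count.
            old = self.items.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def heavy_hitters(self, theta):
        threshold = theta * self.window
        return {x for x, c in self.counts.items() if c >= threshold}


swc = SlidingWindowCounter(window=4)
for flow in ["a", "a", "a", "b"]:
    swc.add(flow)
early = swc.heavy_hitters(0.5)
for flow in ["b", "b", "b"]:
    swc.add(flow)
late = swc.heavy_hitters(0.5)
```

Because old items are evicted, a flow that has only recently become heavy shows up as soon as it dominates the window, instead of being masked by older traffic. The cost of this exact baseline is memory proportional to the window size.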
