Shay Vargaftik

ORCID: 0000-0002-0982-7894
Research Areas
  • Cloud Computing and Resource Management
  • Network Security and Intrusion Detection
  • Distributed and Parallel Computing Systems
  • Privacy-Preserving Technologies in Data
  • Software-Defined Networks and 5G
  • Internet Traffic Analysis and Secure E-voting
  • Network Traffic and Congestion Control
  • Stochastic Gradient Optimization Techniques
  • Sparse and Compressive Sensing Techniques
  • Advanced Queuing Theory Analysis
  • Distributed Sensor Networks and Detection Algorithms
  • Anomaly Detection Techniques and Applications
  • Software System Performance and Reliability
  • Advanced Optical Network Technologies
  • Advanced Wireless Network Optimization
  • Distributed Systems and Fault Tolerance
  • Network Packet Processing and Optimization
  • Data Stream Mining Techniques
  • Cryptography and Data Security
  • Parallel Computing and Optimization Techniques
  • Age of Information Optimization
  • Interconnection Networks and Systems
  • Advanced Malware Detection Techniques
  • Advanced MIMO Systems Optimization
  • IoT and Edge/Fog Computing

Broadcom (Israel)
2023-2024

Technion – Israel Institute of Technology
2016-2022

Herzliya Medical Center
2020-2022

Kitware (United States)
2021

IBM Research - Haifa
2016-2017

We present dRMT (disaggregated Reconfigurable Match-Action Table), a new architecture for programmable switches. dRMT overcomes two important restrictions of RMT, the predominant pipeline-based architecture for programmable switches: (1) table memory is local to an RMT pipeline stage, implying that memory not used by one stage cannot be reclaimed by another, and (2) RMT is hardwired to always sequentially execute matches followed by actions as packets traverse pipeline stages. We show that these restrictions make it difficult to execute programs efficiently on RMT.

10.1145/3098822.3098823 article EN 2017-08-04

Machine learning is widely used to solve networking challenges, ranging from traffic classification and anomaly detection to network configuration. However, machine learning also requires significant processing and often increases the load on both networks and servers. The introduction of in-network computing, enabled by programmable network devices, has made it possible to run applications within the network, providing higher throughput and lower latency. Soon after, in-network machine learning solutions started to emerge, enabling machine learning functionality within the network itself. This survey...

10.1109/comst.2023.3344351 article EN IEEE Communications Surveys & Tutorials 2023-12-19

The soaring use of machine learning leads to increasing processing demands. As data volume keeps growing, providing classification services with good machine learning performance, high throughput, low latency, and minimal equipment overheads becomes a challenge. Offloading machine learning tasks to network switches can be a scalable solution to this problem, providing high throughput and low latency. However, network devices are resource constrained, and lack support for machine learning functionality. In this paper, we introduce IIsy, a novel tool for mapping machine learning models to off-the-shelf switches....

10.1109/tnet.2024.3364757 article EN IEEE/ACM Transactions on Networking 2024-02-16

New congestion control algorithms are rapidly improving datacenters by reducing latency, overcoming incast, and increasing throughput and fairness. Ideally, the operating system in every server and virtual machine is updated to support the new congestion control algorithms. However, legacy applications often cannot be upgraded to a new operating system version, which means the advances are off-limits to them. Worse, as we show, legacy applications can be squeezed out, which in the worst case prevents the entire network from adopting new algorithms.

10.1145/2934872.2934889 article EN 2016-08-01

Using programmable network devices to aid in-network machine learning has been the focus of significant research. However, most of the research was of a limited scope, providing a proof of concept or describing a closed-source algorithm. To date, no general solution has been provided for mapping machine learning algorithms to programmable network devices. In this paper, we present Planter, an open-source, modular framework for mapping trained machine learning models to programmable devices. Planter supports a wide range of machine learning models and multiple targets, and can be easily extended. The evaluation compares different...

10.48550/arxiv.2205.08824 preprint EN other-oa arXiv (Cornell University) 2022-01-01

The rat race between user-generated data and data-processing systems is currently won by data. The increased use of machine learning leads to a further increase in processing requirements, while data volume keeps growing. To win the race, machine learning needs to be applied to the data as it goes through the network. In-network classification of data can reduce the load on servers, reduce response time and increase scalability. In this paper, we introduce IIsy, implementing machine learning classification models in a hybrid fashion using off-the-shelf network devices. IIsy targets three main challenges...

10.48550/arxiv.2205.08243 preprint EN other-oa arXiv (Cornell University) 2022-01-01

In-network machine learning inference provides high throughput and low latency. It is ideally located within the network, is power efficient, and improves applications' performance. Despite its advantages, the bar to in-network machine learning research is high, requiring significant expertise in programmable data planes, in addition to knowledge of the application area. Existing solutions are mostly one-time efforts that are hard to reproduce, change, or port across platforms. In this paper, we present Planter: a modular and efficient...

10.1145/3687230.3687232 article EN ACM SIGCOMM Computer Communication Review 2024-01-30

Disaggregated Large Language Model (LLM) inference has gained popularity as it separates the computation-intensive prefill stage from the memory-intensive decode stage, avoiding prefill-decode interference and improving resource utilization. However, transmitting Key-Value (KV) data between the two stages can be a bottleneck, especially for long prompts. Additionally, the computation time overhead is key to optimizing Job Completion Time (JCT), and the KV data size can become prohibitive for long prompts and sequences. Existing...

10.48550/arxiv.2502.03589 preprint EN arXiv (Cornell University) 2025-02-05

Due to the large data volume and number of distinct elements, space is often the bottleneck of many stream processing systems. The data structures used by these systems often consist of counters whose optimization yields significant memory savings. The challenge lies in balancing the size of the counters: too small, and they overflow; too large, and memory capacity limits their number. In this work, we suggest an efficient encoding scheme that sizes each counter according to its needs. Our approach uses fixed-sized pools (e.g., a single word or 64...

10.48550/arxiv.2502.14699 preprint EN arXiv (Cornell University) 2025-02-20

Secure aggregation is commonly used in federated learning (FL) to alleviate privacy concerns related to the central aggregator seeing all parameter updates in the clear. Unfortunately, most existing secure aggregation schemes ignore two critical orthogonal research directions that aim to (i) significantly reduce client-server communication and (ii) mitigate the impact of malicious clients. However, both of these additional properties are essential to facilitate cross-device FL with thousands or even millions of (mobile)...

10.1109/satml59370.2024.00031 article EN 2024-04-09

Cloud operators require real-time identification of Heavy Hitters (HH) and Hierarchical Heavy Hitters (HHH) for applications such as load balancing, traffic engineering, and attack mitigation. However, existing techniques are slow in detecting new heavy hitters.

10.1145/3281411.3281427 article EN 2018-11-28

Nowadays, the efficiency and even the feasibility of traditional load-balancing policies are challenged by the rapid growth of cloud infrastructure and the increasing levels of server heterogeneity. In such heterogeneous systems with many load balancers, traditional solutions, such as JSQ, incur a prohibitively large communication overhead and detrimental incast effects due to herd behavior. Alternative low-communication policies, such as JSQ(d) and the recently proposed JIQ, are either unstable or provide poor performance. We introduce Local Shortest...

10.1109/tnet.2020.2980061 article EN IEEE/ACM Transactions on Networking 2020-03-31
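
For readers unfamiliar with the baseline policies named in the abstract above, the following is a minimal Python sketch of JSQ (join the shortest queue) and JSQ(d) (sample d servers, join the shortest of those). It is a textbook illustration only, not the policy the paper introduces; the function names and the list-of-queue-lengths representation are assumptions made for this example.

```python
import random


def jsq(queue_lengths):
    """JSQ: inspect every queue and return the index of the shortest one."""
    return min(range(len(queue_lengths)), key=queue_lengths.__getitem__)


def jsq_d(queue_lengths, d, rng=random):
    """JSQ(d): sample d distinct servers uniformly at random and
    return the index of the shortest queue among the sampled ones."""
    sampled = rng.sample(range(len(queue_lengths)), d)
    return min(sampled, key=queue_lengths.__getitem__)


if __name__ == "__main__":
    queues = [4, 0, 7, 2]           # hypothetical queue lengths, one per server
    print(jsq(queues))              # full scan over all servers
    print(jsq_d(queues, d=2))       # only two servers are probed
```

JSQ needs the state of every server for each dispatch decision, which is the communication overhead the abstract refers to; JSQ(d) probes only d servers per decision.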

Counters are the fundamental building block of many data sketching schemes, which hash items to a small number of counters and account for collisions to provide good approximations for frequencies and other measures. Most existing methods rely on fixed-size counters, which may be wasteful in terms of space, as counters must be large enough to eliminate any risk of overflow. Instead, some solutions use small, fixed-size counters that may overflow into secondary structures. This paper takes a different approach. We propose a simple and general method called SALSA...

10.1109/icde51399.2021.00080 article EN 2021 IEEE 37th International Conference on Data Engineering (ICDE) 2021-04-01

Consistent hashing is a central building block in many networking applications, such as maintaining connection affinity of TCP flows. However, current consistent hashing solutions do not ensure full consistency under arbitrary changes or scale poorly in terms of memory footprint, update time and key lookup complexity. We present AnchorHash, a scalable and fully-consistent hashing algorithm. AnchorHash achieves a high key lookup rate, a low memory footprint and a low update time. We formally establish its strong theoretical guarantees, and present an advanced...

10.1109/tnet.2020.3039547 article EN IEEE/ACM Transactions on Networking 2020-12-10
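
As background for the abstract above, here is a minimal Python sketch of classic ring-based consistent hashing with virtual nodes. It is explicitly not AnchorHash, whose details are not given in this listing; the class name `HashRing`, the `vnodes` parameter and the use of SHA-256 are all assumptions chosen for illustration.

```python
import bisect
import hashlib


def _point(label: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(label.encode()).digest()[:8], "big")


class HashRing:
    """Classic consistent-hash ring (illustrative, not AnchorHash)."""

    def __init__(self, nodes, vnodes=64):
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        # Each node owns several positions to even out the key distribution.
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_point(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(p, n) for p, n in self._ring if n != node]

    def lookup(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring is empty")
        # First ring position at or after the key's position, wrapping around.
        i = bisect.bisect_left(self._ring, (_point(key), ""))
        return self._ring[i % len(self._ring)][1]


if __name__ == "__main__":
    ring = HashRing(["server-a", "server-b", "server-c"])
    print(ring.lookup("10.0.0.1:443->10.0.0.2:51000"))  # hypothetical flow key
```

In this sketch, removing a node only remaps the keys that node owned, which is the connection-affinity property the abstract mentions for TCP flows. Lookup costs a binary search over all virtual nodes, and the ring stores `vnodes` entries per node.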

Direct memory access (DMA) renders a system vulnerable to DMA attacks, in which I/O devices access memory regions not intended for their use. Hardware input-output memory management units (IOMMU) can be used to provide protection. However, an IOMMU cannot prevent all DMA attacks because it only restricts DMA at page-level granularity, leading to sub-page vulnerabilities.

10.1145/3447786.3456249 article EN 2021-04-21

Counters are a fundamental building block for networking applications such as load balancing, traffic engineering, and intrusion detection, which require estimating flow sizes and identifying heavy hitter flows. Existing works suggest replacing counters with shorter multiplicative error estimators that improve the accuracy by fitting more of them within a given space. However, such estimators impose a computational overhead that degrades the measurement throughput. Instead, we propose additive error estimators, which are simpler, faster,...

10.1109/infocom41043.2020.9155340 article EN IEEE INFOCOM 2020 - IEEE Conference on Computer Communications 2020-07-01

Integrating optical circuit switches in data-centers is an on-going research challenge. In recent years, state-of-the-art solutions introduce hybrid packet/circuit architectures for different optical circuit switch technologies, control techniques, and traffic rerouting methods. These solutions are based on separated packet and circuit planes, which do not have the ability to utilize an optical circuit with flows that do not arrive from, or are not delivered to, switches directly connected to the circuit's end-points. Moreover, current SDN-based elephant flow rerouting methods require a...

10.1145/3390251.3390253 article EN ACM SIGCOMM Computer Communication Review 2020-03-23

Backpressure schemes are known to stabilize stochastic networks through the use of congestion gradients in routing and resource allocation decisions. Nonetheless, these schemes share a significant drawback, namely, the delay guarantees are obtained only in terms of average values. As a result, arbitrary packets may never reach their destination due to both the starvation and the last-packet problems. These problems occur because in backpressure schemes, packet scheduling needs a subsequent stream of packets to produce the required congestion gradient for...

10.1109/tnet.2017.2706366 article EN IEEE/ACM Transactions on Networking 2017-06-09

Hybrid switching combines a high-bandwidth optical circuit switch in parallel with a low-bandwidth electronic packet switch. It presents an appealing solution for scaling datacenter architectures. Unfortunately, it does not fit many traffic patterns produced by typical datacenter applications, and in particular the skewed traffic patterns that involve highly intensive one-to-many and many-to-one communications.

10.1145/2999572.2999610 article EN 2016-11-29

A parallel server system is considered in which a dispatcher routes incoming jobs to a fixed number of heterogeneous servers, each with its own queue. Much effort has been previously made to design policies that use limited state information (e.g., the queue lengths in a small subset of the set of servers, or the identity of the idle servers). However, existing policies either do not achieve the stability region or perform poorly in terms of job completion time. We introduce Persistent-Idle (PI), a new, perhaps counterintuitive, load-distribution policy...

10.1287/stsy.2019.0054 article EN cc-by Stochastic Systems 2020-05-27

Cloud operators require timely identification of Heavy Hitters (HH) and Hierarchical Heavy Hitters (HHH) for applications such as load balancing, traffic engineering, and attack mitigation. However, existing techniques are slow in detecting new heavy hitters. In this paper, we present the case for identifying heavy hitters through sliding windows. Sliding windows are quicker and more accurate to detect new heavy hitters than current...

10.1109/tnet.2021.3132385 article EN IEEE/ACM Transactions on Networking 2022-01-29
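
To make the sliding-window idea in the abstract above concrete, the following Python sketch keeps exact counts over the last W items and reports the items whose share of the window meets a threshold. It is a naive exact baseline that stores the whole window, not the paper's algorithm; the class name, the `window` parameter and the `theta` threshold are assumptions made for this example.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Exact per-item counts over the most recent `window` items."""

    def __init__(self, window: int):
        self.window = window
        self.items = deque()     # the items currently inside the window
        self.counts = Counter()  # item -> occurrences inside the window

    def add(self, item) -> None:
        self.items.append(item)
        self.counts[item] += 1
        if len(self.items) > self.window:
            # Evict the oldest item so only the last `window` items count.
            old = self.items.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def heavy_hitters(self, theta: float) -> set:
        """Items whose count is at least theta times the window occupancy."""
        threshold = theta * len(self.items)
        return {x for x, c in self.counts.items() if c >= threshold}


if __name__ == "__main__":
    w = SlidingWindowCounter(window=4)
    for src in ["a", "a", "a", "b", "b", "b", "b"]:  # hypothetical flow sources
        w.add(src)
    print(w.heavy_hitters(theta=0.5))
```

Because old items are evicted as new ones arrive, a flow that only recently became heavy dominates the window quickly. This sketch uses memory proportional to the window size, which is the cost that approximate schemes aim to avoid.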