NFDI4DS | UHH-SEMS - Publication Details

Data center TCP (DCTCP)

OPENALEX - Publications

Mohammad Alizadeh, Albert Greenberg, David A. Maltz, Jitendra Padhye, Parveen Patel, and 3 more

Cloud data centers host diverse applications, mixing workloads that require small predictable latency with others requiring large sustained throughput. In this environment, today's state-of-the-art TCP protocol falls short. We present measurements of a 6000 server production cluster and reveal impairments that lead to high application latencies, rooted in TCP's demands on the limited buffer space available in data center switches. For example, bandwidth hungry "background" flows build up queues at...

10.1145/1851182.1851192 article EN 2010-08-30

pFabric

Mohammad Alizadeh, Shuang Yang, Milad Sharif, Sachin Katti, Nick McKeown, and 2 more

In this paper we present pFabric, a minimalistic datacenter transport design that provides near theoretically optimal flow completion times even at the 99th percentile for short flows, while still minimizing average flow completion time for long flows. Moreover, pFabric delivers this performance with a very simple design that is based on a key conceptual insight: datacenter transport should decouple flow scheduling from rate control. For flow scheduling, packets carry a single priority number set independently by each flow; switches have very small buffers and...

10.1145/2486001.2486031 article EN 2013-08-13

CONGA

Mohammad Alizadeh, Tom Edsall, Sarang Dharmapurikar, Ramanan Vaidyanathan, Kevin Chu, and 6 more

We present the design, implementation, and evaluation of CONGA, a network-based distributed congestion-aware load balancing mechanism for datacenters. CONGA exploits recent trends including the use of regular Clos topologies and overlays for network virtualization. It splits TCP flows into flowlets, estimates real-time congestion on fabric paths, and allocates flowlets to paths based on feedback from remote switches. This enables CONGA to efficiently balance load and seamlessly handle asymmetry, without requiring any...

10.1145/2619239.2626316 article EN 2014-08-12

Data center TCP (DCTCP)

Mohammad Alizadeh, Albert Greenberg, David A. Maltz, Jitendra Padhye, Parveen Patel, and 3 more

Cloud data centers host diverse applications, mixing workloads that require small predictable latency with others requiring large sustained throughput. In this environment, today's state-of-the-art TCP protocol falls short. We present measurements of a 6000 server production cluster and reveal impairments that lead to high application latencies, rooted in TCP's demands on the limited buffer space available in data center switches. For example, bandwidth hungry "background" flows build up queues at...

10.1145/1851275.1851192 article EN ACM SIGCOMM Computer Communication Review 2010-08-16

HPCC

Yuliang Li, Rui Miao, Hongqiang Harry Liu, Yan Zhuang, Fei Feng, and 6 more

Congestion control (CC) is the key to achieving ultra-low latency, high bandwidth and network stability in high-speed networks. From years of experience operating large-scale and high-speed RDMA networks, we find the existing high-speed CC schemes have inherent limitations for reaching these goals. In this paper, we present HPCC (High Precision Congestion Control), a new high-speed CC mechanism which achieves the three goals simultaneously. HPCC leverages in-network telemetry (INT) to obtain precise link load information and controls traffic precisely. By...

10.1145/3341302.3342085 article EN 2019-08-14

Homa

Behnam Montazeri, Yilong Li, Mohammad Alizadeh, John K. Ousterhout

Homa is a new transport protocol for datacenter networks. It provides exceptionally low latency, especially for workloads with a high volume of very short messages, and it also supports large messages and high network utilization. Homa uses in-network priority queues to ensure low latency for short messages; priority allocation is managed dynamically by each receiver and integrated with a receiver-driven flow control mechanism. Homa also uses controlled overcommitment of receiver downlinks to ensure efficient bandwidth utilization at high load. Our implementation of Homa delivers 99th percentile...

10.1145/3230543.3230564 preprint EN 2018-08-07

pFabric

Mohammad Alizadeh, Shuang Yang, Milad Sharif, Sachin Katti, Nick McKeown, and 2 more

In this paper we present pFabric, a minimalistic datacenter transport design that provides near theoretically optimal flow completion times even at the 99th percentile for short flows, while still minimizing average flow completion time for long flows. Moreover, pFabric delivers this performance with a very simple design that is based on a key conceptual insight: datacenter transport should decouple flow scheduling from rate control. For flow scheduling, packets carry a single priority number set independently by each flow; switches have very small buffers and...

10.1145/2534169.2486031 article EN ACM SIGCOMM Computer Communication Review 2013-08-27

Packet Transactions

Anirudh Sivaraman, Alvin Cheung, Mihai Budiu, Changhoon Kim, Mohammad Alizadeh, and 4 more

Many algorithms for congestion control, scheduling, network measurement, active queue management, and traffic engineering require custom processing of packets in the data plane of a network switch. To run at line rate, these data-plane algorithms must be implemented in hardware. With today's switch hardware, algorithms cannot be changed, nor new algorithms installed, after a switch has been built.

10.1145/2934872.2934900 article EN 2016-08-01

Language-Directed Hardware Design for Network Performance Monitoring

Srinivas Narayana, Anirudh Sivaraman, Vikram Nathan, Prateesh Goyal, Venkat Arun, and 3 more

Network performance monitoring today is restricted by existing switch support for measurement, forcing operators to rely heavily on endpoints with poor visibility into the network core. Switch vendors have added progressively more monitoring features to switches, but the current trajectory of adding specific features is unsustainable given the ever-changing demands of network operators. Instead, we ask what switch hardware primitives are required to support an expressive language of network performance questions. We believe that the resulting switch hardware design could address a wide variety...

10.1145/3098822.3098829 article EN 2017-08-04

Programmable Packet Scheduling at Line Rate

Anirudh Sivaraman, Suvinay Subramanian, Mohammad Alizadeh, Sharad Chole, Shang-Tse Chuang, and 5 more

Switches today provide a small menu of scheduling algorithms. While we can tweak scheduling parameters, we cannot modify algorithmic logic, or add a completely new algorithm, after the switch has been designed. This paper presents a design for a programmable packet scheduler, which allows scheduling algorithms (potentially algorithms that are unknown today) to be programmed into a switch without requiring hardware redesign.

10.1145/2934872.2934899 article EN 2016-08-01

Bao: Making Learned Query Optimization Practical

Ryan Marcus, Parimarjan Negi, Hongzi Mao, Nesime Tatbul, Mohammad Alizadeh, and 1 more

Query optimization remains one of the most challenging problems in data management systems. Recent efforts to apply machine learning techniques to query optimization challenges have been promising, but have shown few practical gains due to substantive training overhead, inability to adapt to changes, and poor tail performance. Motivated by these difficulties and drawing upon a long history of research in multi-armed bandits, we introduce Bao (the BAndit Optimizer). Bao takes advantage of the wisdom built into existing query optimizers by providing...

10.1145/3448016.3452838 article EN Proceedings of the 2021 International Conference on Management of Data 2021-06-09

CONGA

Mohammad Alizadeh, Tom Edsall, Sarang Dharmapurikar, Ramanan Vaidyanathan, Kevin Chu, and 6 more

We present the design, implementation, and evaluation of CONGA, a network-based distributed congestion-aware load balancing mechanism for datacenters. CONGA exploits recent trends including the use of regular Clos topologies and overlays for network virtualization. It splits TCP flows into flowlets, estimates real-time congestion on fabric paths, and allocates flowlets to paths based on feedback from remote switches. This enables CONGA to efficiently balance load and seamlessly handle asymmetry, without requiring any...

10.1145/2740070.2626316 article EN ACM SIGCOMM Computer Communication Review 2014-08-17

Analysis of DCTCP

Mohammad Alizadeh, Adel Javanmard, Balaji Prabhakar

Cloud computing, social networking and information networks (for search, news feeds, etc.) are driving interest in the deployment of large data centers. TCP is the dominant Layer 3 transport protocol in these networks. However, the operating conditions (very high bandwidth links, low round-trip times, small-buffered switches) and traffic patterns cause TCP to perform very poorly. The Data Center TCP (DCTCP) algorithm has recently been proposed as a TCP variant for data centers and addresses these shortcomings.

10.1145/1993744.1993753 article EN 2011-06-07

dRMT

Sharad Chole, Andy Fingerhut, Sha Ma, Anirudh Sivaraman, Shay Vargaftik, and 7 more

We present dRMT (disaggregated Reconfigurable Match-Action Table), a new architecture for programmable switches. dRMT overcomes two important restrictions of RMT, the predominant pipeline-based architecture for programmable switches: (1) table memory is local to an RMT pipeline stage, implying that memory not used by one stage cannot be reclaimed by another, and (2) RMT is hardwired to always sequentially execute matches followed by actions as packets traverse pipeline stages. We show that these restrictions make it difficult to execute programs efficiently on RMT.

10.1145/3098822.3098823 article EN 2017-08-04

Millions of little minions

Vimalkumar Jeyakumar, Mohammad Alizadeh, Yilong Geng, Changhoon Kim, David Mazières

This paper presents a practical approach to rapidly introducing new dataplane functionality into networks: End-hosts embed tiny programs into packets to actively query and manipulate a network's internal state. We show how this "tiny packet program" (TPP) interface gives end-hosts unprecedented visibility into network behavior, enabling them to work with the network to achieve a desired functionality. Our design leverages what each component does best: (a) switches forward and execute tiny packet programs (at most 5 instructions) in-band at line...

10.1145/2619239.2626292 article EN 2014-08-12

On the Data Path Performance of Leaf-Spine Datacenter Fabrics

Mohammad Alizadeh, Tom Edsall

Modern data center networks must support a multitude of diverse and demanding workloads at low cost, and even the most simple architectural choices can impact mission-critical application performance. This forces network architects to continually evaluate tradeoffs between ideal designs and pragmatic, cost effective solutions. In real commercial environments the number of parameters that the architect can control is fairly limited and typically includes only the choice of topology, link speeds, over subscription, and switch buffer...

10.1109/hoti.2013.23 article EN 2013-08-01

Neo

Ryan Marcus, Parimarjan Negi, Hongzi Mao, Chi Zhang, Mohammad Alizadeh, and 3 more

Query optimization is one of the most challenging problems in database systems. Despite the progress made over the past decades, query optimizers remain extremely complex components that require a great deal of hand-tuning for specific workloads and datasets. Motivated by this shortcoming and inspired by recent advances in applying machine learning to data management challenges, we introduce Neo (Neural Optimizer), a novel learning-based query optimizer that relies on deep neural networks to generate query execution plans....

10.14778/3342263.3342644 article EN Proceedings of the VLDB Endowment 2019-07-01

Tsunami

Jialin Ding, Vikram Nathan, Mohammad Alizadeh, Tim Kraska

Filtering data based on predicates is one of the most fundamental operations for any modern data warehouse. Techniques to accelerate the execution of filter expressions include clustered indexes, specialized sort orders (e.g., Z-order), multi-dimensional indexes, and, for high selectivity queries, secondary indexes. However, these schemes are hard to tune and their performance is inconsistent. Recent work on learned multi-dimensional indexes has introduced the idea of automatically optimizing an index for a particular dataset and workload. However, the performance of that work suffers in...

10.14778/3425879.3425880 article EN Proceedings of the VLDB Endowment 2020-10-01

Robust Query Driven Cardinality Estimation under Changing Workloads

Parimarjan Negi, Zi‐Niu Wu, Andreas Kipf, Nesime Tatbul, Ryan Marcus, and 3 more

Query driven cardinality estimation models learn from a historical log of queries. They are lightweight, having low storage requirements, fast inference and training, and are easily adaptable for any kind of query. Unfortunately, such models can suffer unpredictably bad performance under workload drift, i.e., if the query pattern or data changes. This makes them unreliable and hard to deploy. We analyze the reasons why models become unpredictable due to workload drift, and introduce modifications to the query representation and neural network training techniques...

10.14778/3583140.3583164 article EN Proceedings of the VLDB Endowment 2023-02-01

Data center transport mechanisms: Congestion control theory and IEEE standardization

Mohammad Alizadeh, Berk Atikoglu, Abdul Kabbani, Ashvin Lakshmikantha, Rong Pan, and 2 more

Data Center Networks present a novel, unique and rich environment for algorithm development and deployment. Projects are underway in the IEEE 802.1 standards body, especially in the Data Center Bridging Task Group, to define new switched Ethernet functions for data center use. One such project is IEEE 802.1Qau, the Congestion Notification project, whose aim is to develop an Ethernet congestion control algorithm for hardware implementation. A major contribution of this paper is the description and analysis of the congestion control algorithm QCN (Quantized Congestion Notification), which has been developed...

10.1109/allerton.2008.4797706 article EN 2008-09-01

AF-QCN: Approximate Fairness with Quantized Congestion Notification for Multi-tenanted Data Centers

Abdul Kabbani, Mohammad Alizadeh, Masato Yasuda, Rong Pan, Balaji Prabhakar

Data Center Networks represent the convergence of computing and networking, of data and storage networks, and of packet transport mechanisms in Layers 2 and 3. Congestion control algorithms are a key component of data transport in this type of network. Recently, a Layer 2 congestion management algorithm, called QCN (Quantized Congestion Notification), has been adopted for the IEEE 802.1 Data Center Bridging standard: IEEE 802.1Qau. The QCN algorithm has been designed to be stable, responsive, and simple to implement. However, it does not provide weighted fairness, where the weights can be set...

10.1109/hoti.2010.26 article EN 2010-08-01

NUMFabric

Kanthi Nagaraj, Dinesh Bharadia, Hongzi Mao, Sandeep Chinchali, Mohammad Alizadeh, and 1 more

We present xFabric, a novel datacenter transport design that provides flexible and fast bandwidth allocation control. xFabric is flexible: it enables operators to specify how bandwidth is allocated amongst contending flows to optimize for different service-level objectives such as minimizing flow completion times, weighted allocations, different notions of fairness, etc. xFabric is also very fast: it converges to the specified allocation one to two orders of magnitude faster than prior schemes. Underlying xFabric is a distributed algorithm that uses in-network...

10.1145/2934872.2934890 article EN 2016-08-01