Shiyao Ma

ORCID: 0000-0002-9013-789X
Research Areas
  • Cloud Computing and Resource Management
  • Software-Defined Networks and 5G
  • Parallel Computing and Optimization Techniques
  • Network Traffic and Congestion Control
  • Peer-to-Peer Network Technologies
  • Distributed and Parallel Computing Systems
  • Interconnection Networks and Systems
  • Advanced Optical Network Technologies
  • Privacy-Preserving Technologies in Data
  • Advanced Data Storage Technologies
  • Caching and Content Delivery
  • IoT and Edge/Fog Computing
  • Cooperative Communication and Network Coding
  • Distributed Sensor Networks and Detection Algorithms

Hong Kong University of Science and Technology
2016-2018

University of Hong Kong
2016-2018

Fair and efficient coflow scheduling improves application-level networking performance in today's datacenters. Ideally, a coflow scheduler should provide isolation guarantees on the minimum progress to achieve predictable performance. Network operators, on the other hand, strive to decrease the average coflow completion time (CCT). Unfortunately, optimal isolation guarantees and minimum average CCT are conflicting objectives and cannot be achieved at the same time. Existing schedulers either optimize isolation guarantees at the expense of long CCTs (e.g., HUG [1]), or optimize CCT without isolation guarantees (e.g., Varys, Aalo [2],...

10.1109/infocom.2017.8057172 article EN IEEE INFOCOM 2017 - IEEE Conference on Computer Communications 2017-05-01

BBR is a new congestion-based congestion control algorithm proposed by Google. A BBR flow sequentially measures the bottleneck bandwidth and round-trip delay of the network pipe, and uses the measured results to govern its sending behavior, maximizing the delivery bandwidth while minimizing the delay. However, our deployment of BBR in geo-distributed cloud servers reveals a severe RTT fairness problem: a BBR flow with a longer RTT dominates a competing flow with a shorter RTT. Somewhat surprisingly, our deployment on the Internet and an in-house cluster unearthed a consistent disparity...

10.48550/arxiv.1706.09115 preprint EN other-oa arXiv (Cornell University) 2017-01-01
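
The two BBR abstracts on this page describe a sender that measures bottleneck bandwidth and round-trip time and uses both to govern how much data it keeps in flight. The sketch below is only an illustration of that idea, not the authors' code or Google's implementation: it assumes the bottleneck bandwidth is taken as the maximum delivery-rate sample and the base RTT as the minimum RTT sample, and the function name, sample values, and units (bytes per millisecond, milliseconds) are all hypothetical.

```python
def estimate_bdp(delivery_rates, rtts):
    """Estimate the bandwidth-delay product (BDP) from measurement samples.

    delivery_rates: delivery-rate samples in bytes per millisecond (assumed unit)
    rtts: round-trip time samples in milliseconds (assumed unit)
    """
    if not delivery_rates or not rtts:
        raise ValueError("need at least one sample of each kind")
    # Assumption: the highest observed delivery rate approximates the bottleneck bandwidth.
    bottleneck_bw = max(delivery_rates)
    # Assumption: the lowest observed RTT approximates the queue-free round-trip delay.
    min_rtt = min(rtts)
    # The product is the amount of data that fills the pipe without building a queue.
    return bottleneck_bw * min_rtt


# Hypothetical samples: three rate measurements and three RTT measurements.
bdp = estimate_bdp([1000, 1250, 1100], [50, 40, 60])
print(bdp)  # 1250 bytes/ms * 40 ms
```

Under this simplification a flow with a larger minimum RTT computes a larger BDP for the same bandwidth, which gives some intuition for why RTT differences matter in the fairness problem the abstracts discuss.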

Even with the recent proliferation of in-memory computation in data-parallel frameworks (such as Spark), transfers over the network are still time-consuming. Similar to computation, network transfers serve as the main roadblocks as we try to minimize job completion times. Existing schedulers were designed as isolated solutions that focused on computation or network performance only. Without any coordination, the utilization of computation and network resources may become unbalanced, leading to a reduced level of overall resource utilization. In this paper, we design, implement,...

10.1109/infocom.2016.7524415 article EN 2016-04-01

BBR is a congestion-based congestion control algorithm recently proposed by Google. It proactively measures the bottleneck bandwidth and round trip times (RTTs) of the connection pipe, based on which it governs its sending behaviors. Despite significant throughput gains and latency reduction, some experimental studies reveal that BBR may result in a salient RTT-fairness problem, in which short-RTT flows can be starved of bandwidth allocation when competing with long-RTT flows. In this paper, we study BBR's RTT-fairness problem from...

10.1109/glocom.2018.8647260 article EN 2018 IEEE Global Communications Conference (GLOBECOM) 2018-12-01

Guaranteed performance for data-parallel applications is important for both service providers and the cloud data centers that host such services. A job of such applications involves communication among multiple machines to transmit intermediate results. Such communication comprises a collection of parallel flows, which is abstracted as a coflow in recent proposals. In this paper, we study the problem of meeting the deadlines of coflows in data center networks. Existing flow-level scheduling schemes are insufficient to guarantee coflow-level performance, since...

10.1109/icc.2016.7511249 article EN 2016-05-01

Tasks in a data-parallel job communicate with each other through a number of concurrent flows, which is described as a coflow. These flows are correlated in the sense that the performance of a coflow is dictated by the flow that takes the longest time to complete. Minimizing coflow completion times, however, turns out to be a challenge, given the correlation across flows and how they are routed collectively in a datacenter network. In this paper, we propose Tailor, a simple yet effective mechanism with the objective of trimming coflow completion times. To achieve our objective, Tailor...

10.1109/icccn.2016.7568579 article EN 2016-08-01
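
The Tailor abstract above notes that a coflow's performance is dictated by the flow that takes the longest to complete. A minimal sketch of that definition follows; it is not taken from the paper, and it assumes each flow has a fixed size and a constant allocated rate, with the function name and sample numbers chosen purely for illustration.

```python
def coflow_completion_time(flows):
    """Return the completion time of a coflow.

    flows: iterable of (size, rate) pairs, where size is the amount of data to
    send and rate is the constant bandwidth given to that flow (hypothetical
    units, e.g. megabits and megabits per second).
    """
    flows = list(flows)
    if not flows:
        raise ValueError("a coflow needs at least one flow")
    # Each flow finishes after size / rate; the coflow finishes with its slowest flow.
    return max(size / rate for size, rate in flows)


# Three hypothetical flows finishing after 10, 15 and 2 time units.
cct = coflow_completion_time([(100, 10), (300, 20), (50, 25)])
print(cct)  # 15.0
```

In this example, speeding up the first or third flow would not shorten the coflow at all, which is why coflow-aware mechanisms reason about the slowest flow rather than about individual flow times.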

Link utilization has received extensive attention since data centers have become the most pervasive platform for data-parallel applications. A specific job of such applications involves communication among multiple machines. The recently proposed coflow abstraction depicts such communication through a group of parallel flows, and captures application performance through the corresponding network requirements. Existing techniques to improve link utilization, however, either restrict themselves to achieving work conservation, or merely focus...

10.1109/tcc.2016.2628891 article EN IEEE Transactions on Cloud Computing 2016-11-16

Data-parallel applications, especially those associated with user-facing web services, have struggled to enhance their worst case performance. It is therefore important to improve the minimum amount of resources guaranteed for applications in a cluster. Existing cluster management frameworks, however, provide isolation guarantees for computation resources (such as CPU) only, and are oblivious to network guarantees. In this paper, we design, implement and evaluate Libra, a new framework that helps maximize the guarantee of bandwidth...

10.1109/icnp.2016.7784434 article EN 2016-11-01

With the advent of big data processing frameworks, the performance of data-parallel applications is heavily affected by the time it takes to read input data, making it important to improve data locality. Existing methods in achieving data locality have primarily focused on selecting machines to place the tasks of applications. Nevertheless, the set of machines that an application can choose from is determined by a cluster manager, which is oblivious to data location in existing resource sharing frameworks. In this paper, we design, implement and evaluate Custody,...

10.1109/cluster.2016.59 article EN 2016-09-01

User selection has become crucial for decreasing the communication costs of federated learning (FL) over wireless networks. However, centralized user selection causes additional system complexity. This study proposes a network intrinsic approach of distributed user selection that leverages the radio resource competition mechanism in random access. Taking carrier sensing multiple access (CSMA) as an example of random access, we manipulate the contention window (CW) size to prioritize certain users for obtaining radio resources in each round of training....

10.48550/arxiv.2307.03758 preprint EN other-oa arXiv (Cornell University) 2023-01-01
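
The federated learning abstract above mentions adjusting the contention window (CW) size so that certain users are more likely to obtain radio resources. The toy simulation below illustrates only the general effect and is not the paper's scheme: it assumes each user draws a uniform random backoff in [0, CW), the unique smallest backoff wins the round, and ties count as a collision with no winner. All names and parameters are hypothetical.

```python
import random


def contend(cw_sizes, rng):
    """Run one contention round; return the winning user index, or None on a collision."""
    # Each user draws a backoff uniformly from [0, CW) for its own CW size.
    backoffs = [rng.randrange(cw) for cw in cw_sizes]
    smallest = min(backoffs)
    winners = [i for i, b in enumerate(backoffs) if b == smallest]
    # Assumption: two or more users sharing the smallest backoff collide and nobody wins.
    return winners[0] if len(winners) == 1 else None


def win_counts(cw_sizes, rounds, seed=0):
    """Count how often each user wins over a number of independent rounds."""
    rng = random.Random(seed)  # seeded for reproducibility
    counts = [0] * len(cw_sizes)
    for _ in range(rounds):
        winner = contend(cw_sizes, rng)
        if winner is not None:
            counts[winner] += 1
    return counts


# User 0 gets a small CW (prioritized); user 1 gets a large CW.
print(win_counts([4, 64], rounds=2000))
```

With these settings the user holding the smaller window wins the large majority of rounds, showing how CW size alone can steer which users transmit without any central coordinator picking them.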

Link utilization has received extensive attention since datacenters have become the most prevalent platform for data-parallel computing applications. A specific job of such applications involves communication among multiple machines. The coflow abstraction depicts such communication and captures application performance through the corresponding network requirements. Existing techniques to improve link utilization, however, either restrict themselves to work conservation, or merely focus on flow-level metrics and ignore...

10.1109/icc.2017.7996679 article EN 2017-05-01