- Caching and Content Delivery
- Software-Defined Networks and 5G
- Internet Traffic Analysis and Secure E-voting
- Peer-to-Peer Network Technologies
- Network Traffic and Congestion Control
- Network Security and Intrusion Detection
- Cloud Computing and Resource Management
- Wireless Networks and Protocols
- Mobile Ad Hoc Networks
- Green IT and Sustainability
- IPv6, Mobility, Handover, Networks, Security
- Neural Networks and Applications
- Interconnection Networks and Systems
- RFID Technology Advancements
- Advanced Data Storage Technologies
- Tactile and Sensory Interactions
- Software System Performance and Reliability
- Spam and Phishing Detection
- Stalking, Cyberstalking, and Harassment
- Network Packet Processing and Optimization
- IoT and Edge/Fog Computing
- Bluetooth and Wireless Communication Technologies
- Privacy, Security, and Data Protection
- Stochastic Gradient Optimization Techniques
- Web Data Mining and Analysis
Columbia University, 2020-2024
Meta (United States), 2023
Menlo School, 2023
Microsoft (Germany), 2020-2022
Microsoft Research (United Kingdom), 2016-2021
University of Southern California, 2013-2016
Southern California University for Professional Studies, 2014-2016
Microsoft (United States), 2015
University of Edinburgh, 2010-2012
University of Massachusetts Boston, 2007-2009
Modern content-distribution networks both provide bulk content and act as "serving infrastructure" for web services in order to reduce user-perceived latency. Serving infrastructures such as Google's are now critical to the online economy, making it imperative to understand their size, geographic distribution, and growth strategies. To this end, we develop techniques that enumerate the IP addresses of servers in these infrastructures, find their location, and identify the association between clients and clusters of servers. While...
Content delivery networks must balance a number of trade-offs when deciding how to direct a client to a CDN server. Whereas DNS-based redirection requires a complex global traffic manager, anycast depends on BGP to direct a client to a CDN front-end. Anycast is simple to operate, scalable, and naturally resilient to DDoS attacks. This simplicity, however, comes at the cost of precise control of client redirection. We examine the performance implications of using anycast in a global, latency-sensitive CDN. We analyze millions of client-side measurements from Bing...
The Tier-1 ISPs have been considered the Internet's backbone since the dawn of the modern Internet 30 years ago, as they guarantee global reachability. However, their influence and importance are waning as Internet flattening decreases the demand for transit services and increases the importance of private interconnections. Conversely, major cloud providers -- Amazon, Google, IBM, and Microsoft -- are gaining in importance as more services are hosted on their infrastructures. They ardently support Internet flattening and are rapidly expanding their global footprints, which enables them to bypass the Tier-1 ISPs and other large ISPs to reach many...
Content Hypergiants deliver the vast majority of Internet traffic to end users. In recent years, some have invested heavily in deploying services and servers inside end-user networks. With several dozen Hypergiants and thousands of servers deployed inside networks, these off-net (meaning outside the Hypergiant networks) deployments change the structure of the Internet. Previous efforts to study them have relied on proprietary data or specialized per-Hypergiant measurement techniques that neither scale nor generalize, providing a limited view...
Data centers must support a range of workloads with differing demands. Although existing approaches handle routine traffic smoothly, intense hotspots, even if ephemeral, cause excessive packet loss and severely degrade performance. This occurs even though congestion is typically highly localized, with spare buffer capacity at nearby switches. In this paper, we argue that switches should share buffer capacity to effectively handle this spot congestion without the monetary hit of deploying large buffers at individual switches. Specifically, we present...
The network communications between the cloud and the client have become the weak link for global cloud services that aim to provide low latency to their clients. In this paper, we first characterize WAN latency from the viewpoint of a large cloud provider, Azure, whose network edges serve hundreds of billions of TCP connections a day across locations worldwide. In particular, we focus on instances of latency degradation and design a tool, BlameIt, that enables cloud operators to localize the cause (i.e., the faulty AS) of such degradation. BlameIt uses passive diagnosis, using measurements...
Anycast is used to serve content including web pages and DNS, and anycast deployments are growing. However, prior work examining root DNS suggests that anycast deployments incur significant inflation, with users often routed to suboptimal sites. We reassess anycast performance, first extending prior analysis on inflation in the root DNS. We show that inflation is very common in root DNS, affecting more than 95% of users. However, we then show that root DNS latency hardly matters to users because caching is so effective. These findings lead us to question: is inflation inherent to anycast, or can inflation be limited when it...
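The notion of inflation in the abstract above can be illustrated with a minimal sketch. It assumes a simplified definition (latency to the site anycast actually selected, minus the lowest latency to any site), which may differ from the paper's own metrics. The site names, latencies, and function names are hypothetical.

```python
# Simplified, assumed definition of anycast latency inflation:
# latency to the routed site minus the best latency to any site.
# All names and numbers below are hypothetical illustration data.

def inflation_ms(latency_by_site, routed_site):
    """Extra latency (ms) a user pays compared with their best site."""
    return latency_by_site[routed_site] - min(latency_by_site.values())


def fraction_inflated(users, threshold_ms=0.0):
    """Share of users whose inflation exceeds threshold_ms."""
    inflated = sum(
        1 for latencies, site in users
        if inflation_ms(latencies, site) > threshold_ms
    )
    return inflated / len(users)


# Each entry: (measured latency to every site, site chosen by anycast).
users = [
    ({"ams": 20.0, "nyc": 90.0}, "nyc"),   # routed to the slower site
    ({"ams": 15.0, "nyc": 80.0}, "ams"),   # routed optimally
    ({"ams": 110.0, "nyc": 30.0}, "ams"),  # routed to the slower site
    ({"ams": 40.0, "sin": 35.0}, "ams"),   # mildly inflated
]

if __name__ == "__main__":
    print(fraction_inflated(users))
    print(fraction_inflated(users, threshold_ms=10.0))
```

Raising `threshold_ms` separates users who are technically inflated from those whose inflation is large enough to matter, which mirrors the abstract's distinction between how common inflation is and how much it affects users.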
The construction of private WANs by cloud providers enables them to extend their networks to more locations and establish direct connectivity with end user ISPs. Tenants of the cloud providers benefit from this proximity to users, which is supposed to provide improved performance by bypassing the public Internet. However, the performance impact of cloud providers' private WANs is not widely understood. To isolate the impact of a private WAN, we measure from globally distributed vantage points to two large cloud providers, comparing performance when using their worldwide WAN and when instead using the public Internet. The benefits are not universal. While 48% of our...
To mitigate IPv4 exhaustion, IPv6 provides expanded address space, and NAT allows a single public IPv4 address to suffice for many devices assigned private IPv4 address space. Even though NAT has greatly extended the shelf-life of IPv4, some networks need more private IPv4 space than what is officially allocated by IANA due to their size and/or network management practices. Some of these networks resort to using squat space, a term the network operations community uses for large public IPv4 address blocks allocated to organizations but historically never announced to the Internet. While squatting of IP addresses is an...
Recurrent applications that mostly run in the background are a significant source of power consumption on battery-limited mobile phones. We highlight the pitfalls of scheduling such applications independently without awareness of each other's schedules. We illustrate the energy savings that can be achieved via batch scheduling of recurrent phone applications. We then present our on-going work on developing a general framework for batch scheduling and also outline early experiences in studying the benefit on two different platforms - Nokia N95 and HTC (Android) - commonly used...
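The intuition behind batch scheduling can be sketched with a toy model. This is not the authors' framework. It assumes that each distinct wakeup time costs the phone one wakeup regardless of how many tasks run at it, and that aligning a task simply moves its start offset to zero. The periods, offsets, and function names are hypothetical.

```python
# Toy model of recurrent-task wakeups (an assumed simplification):
# tasks that fire at the same instant share a single device wakeup.

def count_wakeups(schedules, horizon):
    """Count distinct wakeup instants in [0, horizon) seconds.

    schedules is a list of (period, offset) pairs, both in seconds.
    """
    times = set()
    for period, offset in schedules:
        times.update(range(offset, horizon, period))
    return len(times)


def batch(schedules):
    """Align every task to offset 0 so coinciding periods share wakeups."""
    return [(period, 0) for period, _offset in schedules]


# Hypothetical background tasks: (period, offset) in seconds.
independent = [(60, 0), (60, 20), (120, 45)]

if __name__ == "__main__":
    horizon = 600
    print(count_wakeups(independent, horizon))         # unaligned schedules
    print(count_wakeups(batch(independent), horizon))  # batched schedules
```

In this toy run, the unaligned schedules produce 25 wakeups in ten minutes and the batched schedules produce 10. A real scheduler would also have to bound how far each task may be shifted from its preferred time.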
Online services all seek to provide their customers with the best Quality of Experience (QoE) possible. Milliseconds of delay can cause users to abandon a cat video or move onto a different shopping site, which translates into lost revenue. Thus, minimizing latency between users and content is crucial. To reduce latency, cloud providers have built massive, global networks. However, their networks must interact with customer ISPs via BGP, which has no concept of performance.
The IPv4 Record Route (RR) Option instructs routers to record their IP addresses in a packet. RR is subject to a nine hop limit and, traditionally, inconsistent support from routers. Recent changes in interdomain connectivity (the so-called "flattening Internet") and new best practices for how routers should handle RR packets suggest that now is a good time to reassess the potential of the RR Option.
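The nine hop limit follows from the IPv4 header layout: the header is at most 60 bytes, 20 of which are fixed, so options get 40 bytes, and the RR option spends 3 bytes on its type, length, and pointer fields before the 4-byte address slots. The sketch below works through that arithmetic and builds the raw option bytes. The constant and function names are illustrative assumptions, and nothing is sent on the network.

```python
import struct

IPV4_MAX_HEADER = 60    # IHL is a 4-bit count of 32-bit words: 15 * 4 bytes
IPV4_BASE_HEADER = 20   # fixed part of the IPv4 header
RR_OPTION_TYPE = 7      # Record Route option type
RR_FIXED_BYTES = 3      # type, length, pointer
ADDRESS_BYTES = 4       # one IPv4 address per recorded hop
RR_FIRST_POINTER = 4    # smallest legal pointer: first address slot


def max_rr_slots():
    """Number of addresses that fit in the IPv4 option space."""
    option_space = IPV4_MAX_HEADER - IPV4_BASE_HEADER
    return (option_space - RR_FIXED_BYTES) // ADDRESS_BYTES


def build_rr_option(slots=None):
    """Return the bytes of an empty RR option, padded to a 4-byte boundary."""
    if slots is None:
        slots = max_rr_slots()
    if not 1 <= slots <= max_rr_slots():
        raise ValueError("slots must be between 1 and %d" % max_rr_slots())
    length = RR_FIXED_BYTES + slots * ADDRESS_BYTES
    option = struct.pack("!BBB", RR_OPTION_TYPE, length, RR_FIRST_POINTER)
    option += bytes(slots * ADDRESS_BYTES)   # zeroed address slots
    padding = (-len(option)) % 4             # End of Option List bytes (0)
    return option + bytes(padding)


if __name__ == "__main__":
    print(max_rr_slots())
    print(build_rr_option().hex())
```

With nine slots the option is 39 bytes long, and one padding byte brings the options area to the full 40 bytes.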
Google, Netflix, Meta, and Akamai serve content to users from offnet servers in thousands of ISPs. These offnets benefit both services and ISPs, via better performance and reduced interdomain and WAN traffic. We argue that this widespread distribution of offnets leads to a concentration of traffic and a previously unacknowledged risk, as many ISPs colocate offnets from multiple providers. This trend contributes to Internet users likely accessing popular services by fetching the majority of their traffic from a single facility -- perhaps even a single rack -- creating shared resources...
Directing users to a nearby, high-performing front-end is core to the business of content delivery networks (CDNs). CDNs which use DNS to direct users to front-end servers face the challenge of making decisions at the LDNS level, not on the client's IP address, and, in many cases, an LDNS is not representative of the clients it serves. The EDNS Client Subnet specification provides a solution by embedding a portion of the client's IP address in the DNS query to help CDNs make better redirection decisions, but both the LDNS and the authoritative resolver (CDN side) must support the standard. While...
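To make "embedding a portion of the client's IP address" concrete, the sketch below encodes an EDNS Client Subnet option for a query in the RFC 7871 wire layout: option code 8, option length, address family, source prefix length, a scope prefix length of 0, and the address truncated to the prefix. The function name and the example address are assumptions for illustration. The code only builds the option bytes and does not assemble or send a full DNS message.

```python
import ipaddress
import struct

ECS_OPTION_CODE = 8   # EDNS option code for Client Subnet
FAMILY_IPV4 = 1
FAMILY_IPV6 = 2


def build_ecs_option(client_ip, source_prefix_len):
    """Encode an ECS option carrying only the client's network prefix."""
    # strict=False zeroes the host bits, so only the prefix is disclosed.
    network = ipaddress.ip_network(
        f"{client_ip}/{source_prefix_len}", strict=False
    )
    family = FAMILY_IPV4 if network.version == 4 else FAMILY_IPV6
    # Send just enough bytes to cover the prefix.
    address_len = (source_prefix_len + 7) // 8
    address = network.network_address.packed[:address_len]
    # Scope prefix length is 0 in queries; the responder fills it in.
    payload = struct.pack("!HBB", family, source_prefix_len, 0) + address
    return struct.pack("!HH", ECS_OPTION_CODE, len(payload)) + payload


if __name__ == "__main__":
    # 203.0.113.77 is a documentation address, used here as a stand-in client.
    print(build_ecs_option("203.0.113.77", 24).hex())
```

Sending a /24 instead of the full address is the usual trade-off: the CDN learns roughly where the client is without the resolver revealing the exact host.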
Many systems rely on traceroutes to monitor or characterize the Internet. The quality of the systems' inferences depends on the completeness and freshness of the traceroutes, but refreshing traceroutes is constrained by limited resources at vantage points. Previous approaches predict which traceroutes are likely out-of-date in order to allocate measurements, or monitor BGP feeds for changes that overlap traceroutes. Both approaches miss many path changes for reasons including the difficulty of predicting changes and the coarse granularity of BGP paths.
Today's data centers must support a range of workloads with different demands. While existing approaches handle routine traffic smoothly, ephemeral but intense hotspots cause excessive packet loss and severely degrade performance. This occurs even though the congestion is typically highly localized and spare buffer capacity is available at nearby switches.
Monitoring performance and availability are critical to operating successful content distribution networks. Internet measurements provide the data needed for traffic engineering, alerting, and network diagnostics. While there are significant benefits to performing end-user active measurements, these capabilities are limited to a small number of content providers with application control. In this work, we present a solution to the long-standing problem of issuing active measurements from clients without requiring application control, e.g., injecting JavaScript...
Content delivery networks (CDNs) provide fast service to clients by replicating content at geographically distributed sites. Most CDNs route clients to a particular site using anycast or unicast with DNS-based redirection. We analyze anycast and unicast and explain why neither of them provides both precise control of user-to-site mapping and high availability in the face of failures, two fundamental goals of CDNs. Anycast compromises control (and hence performance), and unicast compromises availability. We then present new hybrid techniques and demonstrate via experiments...
Does an outage impact any users? Can a geolocation database known to be good at locating users and bad at locating infrastructure be trusted for a particular prefix? Is a content-heavy network likely to peer with a particular network? For these questions and many more, knowing which prefixes contain Internet users aids in interpreting analysis. However, existing datasets of Internet activity are out of date, unvalidated, based on privileged data, or too coarse. As a step towards identifying IP prefixes with users, we present multiple novel techniques to identify...