- Software Reliability and Analysis Research
- Distributed systems and fault tolerance
- Cloud Computing and Resource Management
- Software System Performance and Reliability
- Parallel Computing and Optimization Techniques
- Radiation Effects in Electronics
- Fault Detection and Control Systems
- Risk and Safety Analysis
- Safety Systems Engineering in Autonomy
- Real-Time Systems Scheduling
- Advanced Data Storage Technologies
- IoT and Edge/Fog Computing
- Distributed and Parallel Computing Systems
- Silicon Carbide Semiconductor Technologies
- Advanced Malware Detection Techniques
- IPv6, Mobility, Handover, Networks, Security
- Security and Verification in Computing
- Reliability and Maintenance Optimization
- Software Engineering Research
- Advanced Software Engineering Methodologies
- VLSI and Analog Circuit Testing
- Software Testing and Debugging Techniques
- Network Traffic and Congestion Control
- Green IT and Sustainability
- Software-Defined Networks and 5G
Yale University
2025
University of Washington
2003-2024
Seattle University
2023-2024
University of Utah
2020-2021
George E. Wahlen Department of VA Medical Center
2020
Southern California University for Professional Studies
2018
University of Southern California
2018
Mitchell Institute
2018
Texas A&M University
2018
University of California, Los Angeles
2018
To date, realistic ISP topologies have not been accessible to the research community, leaving work that depends on topology on an uncertain footing. In this paper, we present new Internet mapping techniques that have enabled us to measure router-level ISP topologies. Our techniques reduce the number of required traces compared to a brute-force, all-to-all approach by three orders of magnitude without a significant loss in accuracy. They include the use of BGP routing tables to focus the measurements, the elimination of redundant measurements by exploiting...
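The reduction comes largely from probing only where it can pay off. Below is a minimal sketch of that idea under a simplified routing-table model; the `BgpRoute` struct, `directed_probes` function, and the ASNs are illustrative assumptions, not the paper's actual implementation.

```rust
// Select traceroute probes using BGP tables: only probe prefixes whose AS
// path transits the ISP being mapped, and collapse prefixes that share a path.

use std::collections::HashSet;

/// One BGP table entry as seen from a public traceroute/looking-glass server.
struct BgpRoute {
    prefix: String,    // e.g. "192.0.2.0/24"
    as_path: Vec<u32>, // e.g. [7018, 1239, 701]
}

/// Keep only (server, prefix) pairs whose AS path transits the target ISP;
/// probes that cannot cross the target network are skipped entirely.
fn directed_probes<'a>(
    server: &'a str,
    routes: &'a [BgpRoute],
    target_asn: u32,
) -> Vec<(&'a str, &'a str)> {
    let mut chosen = Vec::new();
    let mut seen_paths: HashSet<Vec<u32>> = HashSet::new();
    for r in routes {
        if !r.as_path.contains(&target_asn) {
            continue; // path never enters the target ISP
        }
        // Prefixes sharing an AS path are likely redundant: probe one
        // representative per distinct path (a crude stand-in for the
        // paper's ingress/egress-based reductions).
        if seen_paths.insert(r.as_path.clone()) {
            chosen.push((server, r.prefix.as_str()));
        }
    }
    chosen
}

fn main() {
    let routes = vec![
        BgpRoute { prefix: "192.0.2.0/24".into(), as_path: vec![7018, 1239, 701] },
        BgpRoute { prefix: "198.51.100.0/24".into(), as_path: vec![7018, 1239, 701] },
        BgpRoute { prefix: "203.0.113.0/24".into(), as_path: vec![7018, 3356] },
    ];
    for (srv, prefix) in directed_probes("lg.example.net", &routes, 1239) {
        println!("traceroute from {srv} to a host in {prefix}");
    }
}
```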
We motivate the capability approach to network denial-of-service (DoS) attacks, and evaluate the traffic validation architecture (TVA), which builds on capabilities. With our approach, rather than send packets to any destination at any time, senders must first obtain "permission to send" from the receiver, which provides permission in the form of capabilities to those senders whose traffic it agrees to accept. The senders then include these capabilities in their packets. This enables verification points distributed around the network to check that traffic has been authorized by...
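The control flow is: the receiver grants a capability, the sender attaches it to its packets, and verification points inside the network check it. A toy sketch of that flow follows; the `Flow`, `Capability`, `mint`, and `verify` names are hypothetical, and the keyed std hasher is not cryptographically secure, unlike TVA's real path-bound, rate-limited capabilities.

```rust
// Toy capability check at a verification point: recompute the tag from the
// router's secret and drop the packet if it does not match or has expired.

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Clone, Hash)]
struct Flow {
    src: [u8; 4],
    dst: [u8; 4],
}

/// Token the receiver returns when it agrees to accept traffic from `flow`,
/// valid until `expiry` (seconds). A real capability also bounds bytes sent.
struct Capability {
    flow: Flow,
    expiry: u64,
    tag: u64,
}

fn mint(flow: &Flow, expiry: u64, router_secret: u64) -> Capability {
    let mut h = DefaultHasher::new();
    router_secret.hash(&mut h);
    flow.hash(&mut h);
    expiry.hash(&mut h);
    Capability { flow: flow.clone(), expiry, tag: h.finish() }
}

/// Check performed by a verification point inside the network.
fn verify(cap: &Capability, now: u64, router_secret: u64) -> bool {
    now <= cap.expiry && mint(&cap.flow, cap.expiry, router_secret).tag == cap.tag
}

fn main() {
    let secret = 0x5eed;
    let flow = Flow { src: [10, 0, 0, 1], dst: [10, 0, 0, 2] };
    let cap = mint(&flow, 1_000, secret); // granted via the receiver's request channel
    println!("fresh packet accepted: {}", verify(&cap, 500, secret));
    println!("expired packet accepted: {}", verify(&cap, 2_000, secret));
}
```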
Real-time systems often have very high reliability requirements and are therefore prime candidates for the inclusion of fault tolerance techniques. In order to provide tolerance of software faults, some form of state restoration is usually advocated as a means of recovery. State restoration can be expensive, and the cost is exacerbated for systems which utilize concurrent processes. The concurrency present in most real-time systems, and the further difficulties introduced by timing constraints, suggest that providing tolerance of software faults may be inordinately expensive or complex. We believe...
In order to assess the effectiveness of software fault tolerance techniques for enhancing the reliability of practical systems, a major experimental project has been conducted at the University of Newcastle upon Tyne. Techniques were developed for, and applied to, a realistic implementation of a real-time system (a naval command and control system). Reliability data were collected by operating this system in a simulated tactical environment under a variety of action scenarios. This paper provides an overview of the project and presents results from three phases...
The need for reliable complex systems motivates the development of techniques by which acceptable service can be maintained, even in the presence of residual errors. Recovery blocks allow a software designer to include tests on the acceptability of various phases of a system's operation, and to specify alternative actions should an acceptance test fail. This approach relies on certain architectural features, ideally implemented in hardware, which allow control and data structures to be retrieved after errors. A brief account is presented of recovery...
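The structure is essentially "try the primary alternate, run an acceptance test, and on failure restore the state and try the next alternate." A minimal sketch follows, assuming a cloneable state as a stand-in for the hardware recovery cache; the function names are illustrative, not the paper's.

```rust
// Recovery block: checkpoint state, run alternates in order, accept the first
// result that passes the acceptance test, restoring state between attempts.

fn recovery_block<S: Clone>(
    state: &mut S,
    alternates: &[fn(&mut S)],
    acceptance_test: fn(&S) -> bool,
) -> Result<(), &'static str> {
    let checkpoint = state.clone(); // "recovery cache" stand-in
    for alt in alternates {
        alt(state);
        if acceptance_test(state) {
            return Ok(()); // acceptable result: discard the checkpoint
        }
        *state = checkpoint.clone(); // state restoration before the next try
    }
    Err("all alternates failed the acceptance test")
}

fn main() {
    // Toy example: the faulty primary is rejected by the acceptance test and
    // the simpler alternate is accepted.
    let mut data = vec![3, 1, 2];
    let primary: fn(&mut Vec<i32>) = |v| v.reverse(); // buggy "sort"
    let alternate: fn(&mut Vec<i32>) = |v| v.sort();  // slower but correct
    let accept: fn(&Vec<i32>) -> bool = |v| v.windows(2).all(|w| w[0] <= w[1]);
    let result = recovery_block(&mut data, &[primary, alternate], accept);
    println!("{result:?} -> {data:?}");
}
```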
The end of Dennard scaling and the slowing of Moore's Law have put the energy use of datacenters on an unsustainable path. Datacenters are already a significant fraction of worldwide electricity use, with application demand scaling at a rapid rate. We argue that substantial reductions in the carbon intensity of datacenter computing are possible with a software-centric approach: by making energy and carbon visible to developers on a fine-grained basis, by modifying system APIs to make it possible to make informed trade-offs between performance and emissions, and by raising the level...
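One concrete reading of the argument: if applications can observe grid carbon intensity, delay-tolerant work can be deferred to cleaner periods. The sketch below is a hedged illustration of such an interface; `CarbonSignal` and `maybe_run` are hypothetical names, not APIs from this work.

```rust
// Expose carbon intensity to application code so delay-tolerant jobs can
// trade latency for emissions.

/// Anything that can report the current grid carbon intensity for the region
/// a job would run in, in gCO2e per kWh.
trait CarbonSignal {
    fn grams_co2_per_kwh(&self) -> f64;
}

struct StaticSignal(f64); // stand-in for a real telemetry source

impl CarbonSignal for StaticSignal {
    fn grams_co2_per_kwh(&self) -> f64 {
        self.0
    }
}

/// Run the job now if the grid is clean enough, otherwise report that it
/// should be re-queued for a lower-carbon window.
fn maybe_run<S: CarbonSignal>(signal: &S, threshold: f64, job: impl FnOnce()) -> bool {
    if signal.grams_co2_per_kwh() <= threshold {
        job();
        true
    } else {
        false
    }
}

fn main() {
    let clean = StaticSignal(80.0);
    let dirty = StaticSignal(450.0);
    let job = || println!("running nightly batch analytics");
    println!("ran on clean grid: {}", maybe_run(&clean, 200.0, job));
    println!("ran on dirty grid: {}", maybe_run(&dirty, 200.0, || ()));
}
```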
As modern server GPUs are increasingly power-intensive, better power management mechanisms can significantly reduce the power consumption, capital costs, and carbon emissions in large cloud datacenters. This letter uses diverse datacenter workloads to study the power management capabilities of modern server GPUs. We find that current GPU management mechanisms have limited compatibility and monitoring support under virtualization. They have sub-optimal, imprecise, and non-intuitive implementations of Dynamic Voltage and Frequency Scaling (DVFS) and power capping. Consequently, efficient GPU power management is...
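For intuition, power capping amounts to a control loop around the device's enforced limit. The sketch below runs against a hypothetical `GpuPower` interface standing in for vendor tooling such as NVML; it is an assumption for illustration, and real DVFS and capping behave far less precisely than this idealized model, which is part of the letter's point.

```rust
// Power-capping control loop over a hypothetical GPU management interface.

trait GpuPower {
    fn power_draw_watts(&self) -> f64;
    fn set_power_limit_watts(&mut self, watts: f64);
}

/// Nudge the enforced power limit toward a datacenter-level budget for this
/// device, moving in small steps so running work is not starved abruptly.
fn enforce_budget<G: GpuPower>(gpu: &mut G, budget_watts: f64, step_watts: f64) {
    let draw = gpu.power_draw_watts();
    if draw > budget_watts {
        gpu.set_power_limit_watts((draw - step_watts).max(budget_watts));
    } else {
        gpu.set_power_limit_watts(budget_watts);
    }
}

// A toy in-memory "GPU" so the sketch runs without hardware.
struct FakeGpu {
    draw: f64,
    limit: f64,
}

impl GpuPower for FakeGpu {
    fn power_draw_watts(&self) -> f64 {
        self.draw.min(self.limit)
    }
    fn set_power_limit_watts(&mut self, watts: f64) {
        self.limit = watts;
    }
}

fn main() {
    let mut gpu = FakeGpu { draw: 310.0, limit: 400.0 };
    for tick in 0..4 {
        enforce_budget(&mut gpu, 250.0, 25.0);
        println!("tick {tick}: draw={:.0}W limit={:.0}W", gpu.power_draw_watts(), gpu.limit);
    }
}
```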
Kernel task scheduling is important for application performance, adaptability to new hardware, and complex user requirements. However, developing, testing, and debugging new scheduling algorithms in Linux, the most widely used cloud operating system, is slow and difficult. We developed Enoki, a framework for high-velocity development of Linux kernel schedulers. Enoki schedulers are written in safe Rust, and the system supports live upgrade of new scheduling policies into the kernel, userspace debugging, and bidirectional communication with applications. A...
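To convey the shape of policy-as-code, here is a hedged sketch of a pluggable scheduling policy in safe Rust; the `Policy` trait and `Task` struct are hypothetical stand-ins for illustration, not Enoki's real interface.

```rust
// Two interchangeable scheduling policies behind one trait; swapping the
// boxed policy is a userspace analogue of a live policy upgrade.

use std::collections::VecDeque;

#[derive(Debug)]
struct Task {
    pid: u32,
    vruntime_ns: u64, // virtual runtime consumed so far
}

/// A policy decides which runnable task gets the CPU next.
trait Policy {
    fn enqueue(&mut self, task: Task);
    fn pick_next(&mut self) -> Option<Task>;
}

/// Simple FIFO policy.
struct Fifo {
    queue: VecDeque<Task>,
}

impl Policy for Fifo {
    fn enqueue(&mut self, task: Task) {
        self.queue.push_back(task);
    }
    fn pick_next(&mut self) -> Option<Task> {
        self.queue.pop_front()
    }
}

/// "Fair" policy: run the task with the least virtual runtime, loosely in
/// the spirit of CFS.
struct LeastVruntime {
    tasks: Vec<Task>,
}

impl Policy for LeastVruntime {
    fn enqueue(&mut self, task: Task) {
        self.tasks.push(task);
    }
    fn pick_next(&mut self) -> Option<Task> {
        let idx = self
            .tasks
            .iter()
            .enumerate()
            .min_by_key(|entry| entry.1.vruntime_ns)
            .map(|entry| entry.0)?;
        Some(self.tasks.swap_remove(idx))
    }
}

fn main() {
    let mut policy: Box<dyn Policy> = Box::new(Fifo { queue: VecDeque::new() });
    policy.enqueue(Task { pid: 1, vruntime_ns: 900 });
    policy.enqueue(Task { pid: 2, vruntime_ns: 100 });
    println!("fifo picks {:?}", policy.pick_next());

    let mut policy: Box<dyn Policy> = Box::new(LeastVruntime { tasks: Vec::new() });
    policy.enqueue(Task { pid: 1, vruntime_ns: 900 });
    policy.enqueue(Task { pid: 2, vruntime_ns: 100 });
    println!("fair picks {:?}", policy.pick_next());
}
```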
Backward error recovery (that is, resetting an erroneous state of a system to a previous, error-free state) is an important general technique for recovering from faults in a system, especially those which were not foreseen. However, the provision of backward error recovery can be complex, particularly if the system implementation is "multilevel" and recovery is to be provided at a number of these levels. This paper discusses two distinct categories of multilevel systems and then examines in detail the issues involved in providing backward error recovery in both types of system.
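A minimal sketch of the basic mechanism at a single level follows, assuming cloneable state and a stack of saved checkpoints; the `Recoverable` type and `region` method are illustrative assumptions, while the paper's concern is how such recovery composes across the levels of a multilevel system.

```rust
// Backward error recovery with nested recovery regions: snapshots are pushed
// when a region is entered, discarded on success, and restored on failure.

struct Recoverable<S: Clone> {
    state: S,
    checkpoints: Vec<S>, // one saved state per nested recovery region
}

impl<S: Clone> Recoverable<S> {
    fn new(state: S) -> Self {
        Recoverable { state, checkpoints: Vec::new() }
    }

    /// Run `body` inside a recovery region: on Err, roll the state back to
    /// what it was when the region was entered.
    fn region<E>(&mut self, body: impl FnOnce(&mut S) -> Result<(), E>) -> Result<(), E> {
        self.checkpoints.push(self.state.clone());
        let result = body(&mut self.state);
        let saved = self.checkpoints.pop().expect("balanced regions");
        if result.is_err() {
            self.state = saved; // the erroneous state is reset to the prior one
        }
        result
    }
}

fn main() {
    let mut account = Recoverable::new(100_i64);
    // The update inside the region is undone when the region reports an error.
    let _ = account.region(|balance: &mut i64| -> Result<(), &str> {
        *balance += 50;
        Err("downstream failure detected after the update")
    });
    println!("balance after failed region: {}", account.state); // still 100
}
```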
Researchers have shown that the Internet exhibits path inflation -- end-to-end paths can be significantly longer than necessary. We present a trace-driven study of 65 ISPs that characterizes the root causes of path inflation, namely topology and routing policy choices within an ISP, between pairs of ISPs, and across the global Internet. To do so, we develop and validate novel techniques to infer intra-domain and peering policies from measurements. We provide the first measured characterization of ISP peering policies. In addition to "early-exit,"...
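As an illustration of one such inference, "early-exit" (hot-potato) routing predicts that traffic leaves an ISP at the peering point nearest its ingress, so observed exit points can be checked against that prediction. The sketch below uses made-up cities and distances as assumptions; the paper derives exit points from traceroutes, not from a table like this.

```rust
// Check whether observed (ingress, egress) pairs are consistent with
// early-exit peering: the egress should be the peering city with the
// smallest intra-domain distance from the ingress.

use std::collections::HashMap;

/// Pick the peering city nearest to `ingress` under the intra-domain
/// "distance" table (e.g. link weights or latency).
fn nearest_peering<'a>(
    ingress: &'a str,
    dist: &HashMap<(&'a str, &'a str), u32>,
    peers: &[&'a str],
) -> Option<&'a str> {
    peers
        .iter()
        .filter_map(|&p| dist.get(&(ingress, p)).map(|&d| (d, p)))
        .min()
        .map(|(_, p)| p)
}

fn main() {
    let mut dist = HashMap::new();
    dist.insert(("Seattle", "San Jose"), 12);
    dist.insert(("Seattle", "Chicago"), 45);
    dist.insert(("Seattle", "New York"), 70);
    let peering_cities = ["San Jose", "Chicago", "New York"];

    // Observed (ingress, egress) pairs for one downstream ISP.
    let observations = [("Seattle", "San Jose"), ("Seattle", "New York")];
    for (ingress, observed_egress) in observations {
        let expected = nearest_peering(ingress, &dist, &peering_cities);
        let early_exit = expected == Some(observed_egress);
        println!("{ingress} -> {observed_egress}: early-exit consistent = {early_exit}");
    }
}
```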