- Cryptography and Data Security
- Privacy-Preserving Technologies in Data
- Distributed Systems and Fault Tolerance
- Internet Traffic Analysis and Secure E-voting
- Advanced Data Storage Technologies
- Peer-to-Peer Network Technologies
- Scientific Computing and Data Management
- Caching and Content Delivery
- Network Security and Intrusion Detection
- Software System Performance and Reliability
- Distributed and Parallel Computing Systems
- Software-Defined Networks and 5G
- Security and Verification in Computing
- Parallel Computing and Optimization Techniques
- Cloud Data Security Solutions
- Real-Time Systems Scheduling
- Data Quality and Management
- Advanced Malware Detection Techniques
- Cloud Computing and Resource Management
- Interconnection Networks and Systems
- Blockchain Technology Applications and Security
- Opportunistic and Delay-Tolerant Networks
- Complexity and Algorithms in Graphs
- Cryptographic Implementations and Security
- Research Data Management Practices
University of Pennsylvania
2012-2025
California University of Pennsylvania
2011-2021
Philadelphia University
2014
Pennsylvania State University
2013
Max Planck Institute for Software Systems
2005-2011
Max Planck Society
2006-2011
Rice University
2004-2007
Karlsruhe Institute of Technology
2000
We demonstrate a system built using probabilistic techniques that allows for remarkably accurate localization across our entire office building using nothing more than the built-in signal intensity meter supplied by standard 802.11 cards. While prior systems have required significant investments of human labor to build a detailed map, we can train our system by spending less than one minute per office or region, walking around with a laptop and recording the observed signal intensities of our building's unmodified base stations. We actually...
A large and rapidly growing proportion of users connect to the Internet via residential broadband networks such as Digital Subscriber Lines (DSL) and cable. Residential networks are often the bottleneck in the last mile of today's Internet. Their characteristics critically affect Internet applications, including voice-over-IP, online games, and peer-to-peer content sharing/delivery systems. However, to date, few studies have investigated commercial broadband deployments, and rigorous measurement data that characterize these networks at scale are lacking. In...
We describe PeerReview, a system that provides accountability in distributed systems. PeerReview ensures that Byzantine faults whose effects are observed by a correct node are eventually detected and irrefutably linked to a faulty node. At the same time, PeerReview ensures that a correct node can always defend itself against false accusations. These guarantees are particularly important for systems that may span multiple administrative domains, which may not trust each other. PeerReview works by maintaining a secure record of the messages sent and received by each node. The record is used...
Differential privacy is becoming a gold standard for privacy research; it offers a guaranteed bound on the loss of privacy due to the release of query results, even under worst-case assumptions. The theory of differential privacy is an active research area, and there are now differentially private algorithms for a wide range of interesting problems. However, the question of when differential privacy works in practice has received relatively little attention. In particular, there is still no rigorous method for choosing the key parameter $\epsilon$, which controls the crucial tradeoff...
Decentralized storage systems aggregate the available disk space of participating computers to provide a large storage facility. These systems rely on data redundancy to ensure durable storage despite node failures. However, existing systems either assume independent node failures, or they rely on introspection to carefully place redundant data on nodes with low expected failure correlation. Unfortunately, node failures are not independent in practice, and constructing an accurate failure model is difficult in large-scale systems. At the same time, malicious worms that propagate...
Differential privacy offers a way to answer queries about sensitive information while providing strong, provable privacy guarantees, ensuring that the presence or absence of a single individual in the database has a negligible statistical effect on the query's result. Proving that a given query has this property involves establishing a bound on the query's sensitivity: how much its result can change when a record is added or removed.
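To make the notion of sensitivity concrete, the following is a minimal sketch of the standard Laplace mechanism applied to a counting query. It is not taken from the publication summarized above; the function names (`count_query`, `laplace_noise`, `private_count`) and the sample data are hypothetical. A counting query has sensitivity 1, because adding or removing one record changes the count by at most 1, so noise drawn with scale 1/epsilon is the usual calibration.

```python
import random


def count_query(records, predicate):
    """Exact count of records that satisfy the predicate."""
    return sum(1 for r in records if predicate(r))


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(records, predicate, epsilon, rng):
    """Noisy count; assumes the query's sensitivity is exactly 1."""
    sensitivity = 1.0  # one added or removed record shifts a count by at most 1
    return count_query(records, predicate) + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    ages = [23, 35, 41, 52, 67, 29, 38]  # hypothetical data
    rng = random.Random(0)
    print(private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng))
```

Smaller values of epsilon give a larger noise scale and therefore stronger privacy at the cost of accuracy. Queries with higher sensitivity, such as sums over unbounded values, would need proportionally more noise or a clamping step first.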
For many companies, clouds are becoming an interesting alternative to a dedicated IT infrastructure. However, cloud computing also carries certain risks for both the customer and the cloud provider. The customer places his computation and data on machines he cannot directly control; the provider agrees to run a service whose details he does not know. If something goes wrong - for example, data leaks to a competitor, or the computation returns incorrect results - it can be difficult to determine which of them has caused the problem, and, in the absence of solid evidence, it is...
This paper introduces secure network provenance (SNP), a novel technique that enables networked systems to explain to their operators why they are in a certain state -- e.g., why a suspicious routing table entry is present on a certain router, or where a given cache entry originated. SNP provides network forensics capabilities by permitting operators to track down faulty or misbehaving nodes, and to assess the damage such nodes may have caused to the rest of the system. SNP is designed for adversarial settings and is robust to manipulation; its tamper-evident properties...
Recently, it has been reported that certain access ISPs are surreptitiously blocking their customers from uploading data using the popular BitTorrent file-sharing protocol. The reports have sparked an intense and wide-ranging policy debate on network neutrality and ISP traffic management practices. However, to date, end users lack measurement tools that can detect whether their access ISPs are blocking their BitTorrent traffic. And since ISPs do not voluntarily disclose their traffic management policies, no one knows how widely BitTorrent traffic blocking is deployed in the current Internet. In this paper,...
In this paper, we introduce accountable virtual machines (AVMs). Like ordinary virtual machines, AVMs can execute binary software images in a virtualized copy of a computer system; in addition, they can record non-repudiable information that allows auditors to subsequently check whether the software behaved as intended. AVMs provide strong accountability, which is important, for instance, in distributed systems where different hosts and organizations do not necessarily trust each other, or where software is hosted on third-party operated...
Content distribution systems have traditionally adopted one of two architectures: infrastructure-based content delivery networks (CDNs), in which clients download content from dedicated, centrally managed servers, and peer-to-peer CDNs, in which clients download content from each other. The advantages and disadvantages of each architecture have been studied in great detail. Recently, hybrid, or 'peer-assisted', CDNs have emerged, which combine elements of both architectures. The properties of such systems, however, are not as well understood.
In this paper, we study the problem of answering queries about private data that is spread across multiple different databases. For instance, a medical researcher may want to study a possible correlation between travel patterns and certain types of illnesses. The necessary information exists today - e.g., in airline reservation systems and hospital records - but it is maintained by two separate companies who are prevented by law from sharing this information with each other, or with a third party. This separation prevents the processing of such...
Today, enterprises collect large amounts of data and leverage the cloud to perform analytics over this data. Since the data is often sensitive, enterprises would prefer to keep it confidential and to hide it even from the cloud operator. Systems such as CryptDB and Monomi can accomplish this by operating mostly on encrypted data; however, these systems rely on expensive cryptographic techniques that limit performance in true big data scenarios that involve terabytes of data or more. This paper presents Seabed, a system that enables efficient analytics over large encrypted datasets. In contrast...
Discovering the causes of incorrect behavior in large networks is often difficult. This difficulty is compounded when some machines in the network are compromised, since these compromised machines may use deception or tamper with data to frustrate forensic analysis. Recently proposed forensic tools enable administrators to learn about system states in a partially compromised network, but these tools are inherently unable to (1) observe covert communication between compromised nodes or (2) detect attempts to exfiltrate sensitive data. In this paper, we observe that the emergence...
In this paper, we propose a new approach to diagnosing problems in complex distributed systems. Our approach is based on the insight that many of the trickiest problems are anomalies. For instance, in a network, problems often affect only a small fraction of the traffic (e.g., perhaps a certain subnet), or they manifest only infrequently. Thus, it is quite common for the operator to have "examples" of both working and non-working traffic readily available – perhaps a packet that was misrouted, and a similar packet that was routed correctly. In this case, the cause of the problem is likely to be wherever the two packets were...
When debugging a distributed system, it is sometimes necessary to explain the absence of an event - for instance, why a certain route is not available, or why a certain packet did not arrive. Existing debuggers offer some support for explaining the presence of events, usually by providing the equivalent of a backtrace in conventional debuggers, but they are not very good at answering 'Why not?' questions: there is simply no starting point for a possible backtrace.
Recently, a number of systems have been deployed that gather sensitive statistics from user devices while giving differential privacy guarantees. One prominent example is the component in Apple's macOS and iOS devices that collects information about emoji usage and new words. However, these systems have been criticized for making unrealistic assumptions, e.g., by creating a very high "privacy budget" for answering queries, and by replenishing this budget every day, which results in a high worst-case privacy loss. However, it is not obvious whether such assumptions can...