- Logic, programming, and type systems
- Formal Methods in Verification
- Software Testing and Debugging Techniques
- Software Engineering Research
- Security and Verification in Computing
- Parallel Computing and Optimization Techniques
- Advanced Malware Detection Techniques
- Embedded Systems Design Techniques
- Advanced Software Engineering Methodologies
- Web Application Security Vulnerabilities
- Teaching and Learning Programming
- Software System Performance and Reliability
- VLSI and Analog Circuit Testing
- Radiation Effects in Electronics
- Numerical Methods and Algorithms
- Distributed systems and fault tolerance
- Model-Driven Software Engineering Techniques
- Advanced Computational Techniques and Applications
- Data Visualization and Analytics
- Educational Games and Gamification
- Digital Filter Design and Implementation
- Software Engineering Techniques and Practices
- Distributed and Parallel Computing Systems
- Scientific Computing and Data Management
- Video Analysis and Summarization
University of California, San Diego
2016-2025
Imperial College London
2024
Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento
2024
University of Lisbon
2024
UC San Diego Health System
2007-2012
University of Washington
2002-2006
Woods Hole Oceanographic Institution
2002-2003
In this paper, we present a new algorithm for partial program verification that runs in polynomial time and space. We are interested checking satisfies given temporal safety property. Our insight is by accurately modeling only those branches which the property-related behavior differs along arms of branch, can design an accurate enough to verify with respect property, without paying potentially exponential cost full path-sensitive analysis.We have implemented "property simulation" as part...
Data races occur when multiple threads are about to access the same piece of memory, and at least one those accesses is a write. Such can lead hard-to-reproduce bugs that time consuming debug fix. We present RELAY, static scalable race detection analysis in which unsoundness modularized few sources. describe results from our experiments using RELAY find data Linux kernel, includes 4.5 million lines code.
Modern websites are powered by JavaScript, a flexible dynamic scripting language that executes in client browsers. A common paradigm such is to include third-party JavaScript code the form of libraries or advertisements. If this were malicious, it could read sensitive information from page write location bar, thus redirecting user malicious page, which entire machine be compromised. We present an information-flow based approach for inferring effects piece has on website order ensure key...
The dynamic nature of JavaScript web applications has given rise to the possibility privacy violating information flows. We present an empirical study prevalence such flows on a large number popular websites. have (1) designed expressive, fine-grained flow policy language that allows us specify and detect different kinds privacy-violating in code,(2) implemented new rewriting-based engine within Chrome browser, (3) used enhanced browser conduct large-scale over Alexa global top 50,000...
Optimizations in a traditional compiler are applied sequentially, with each optimization destructively modifying the program to produce transformed that is then passed next optimization. We present new approach for structuring phase of compiler. In our approach, optimizations take form equality analyses add information common intermediate representation. The optimizer works by repeatedly applying these infer equivalences between fragments, thus saturating representation equalities. Once...
We identify a timing channel in the floating point instructions of modern x86 processors: running time addition and multiplication can vary by two orders magnitude depending on their operands. develop benchmark measuring variability operations report its results. use data to demonstrate practical attacks security Fire fox browser (versions 23 through 27) Fuzz differentially private database. Finally, we initiate study mitigations channels with libfixedtimefixedpoint, new fixed-point,...
Several defenses have increased the cost of traditional, low-level attacks that corrupt control data, e.g. return addresses saved on stack, to compromise program execution. In response, creative adversaries begun circumventing these by exploiting programming errors manipulate pointers virtual tables, or vtables, C++ objects. These can hijack flow whenever a method corrupted object is called, potentially allowing attacker gain complete underlying system. this paper we present SAFEDISPATCH,...
We present Rhodium, a new language for writing compiler optimizations that can be automatically proved sound. Unlike our previous work on Cobalt, Rhodium expresses using explicit dataflow facts manipulated by local propagation and transformation rules. This style allows to mutually recursively defined, composed, interpreted in both flow-sensitive -insensitive ways, applied interprocedurally given separate context-sensitivity strategy, all while retaining soundness. also supports infinite...
Translation validation is a technique for checking that, after an optimization has run, the input and output of are equivalent. Traditionally, translation been used to prove concrete, fully specified programs In this paper we present Parameterized Equivalence Checking (PEC), generalization that can equivalence parameterized programs. A program partially represent multiple concrete For example, may contain section code whose only known property it does not modify certain variables. By proving...
AI-powered programming assistants are increasingly gaining popularity, with GitHub Copilot alone used by over a million developers worldwide. These tools far from perfect, however, producing code suggestions that may be incorrect in subtle ways. As result, face new challenge: validating AI's suggestions. This paper explores whether Live Programming (LP), continuous display of program's runtime values, can help address this challenge. To answer question, we built Python editor combines an...
We describe a technique for automatically proving compiler optimizations sound , meaning that their transformations are always semantics-preserving. first present domain-specific language, called Cobalt, implementing as guarded rewrite rules. Cobalt operate over C-like intermediate representation including unstructured control flow, pointers to local variables and dynamically allocated memory, recursive procedures. Then we the soundness of optimizations. Our requires an automatic theorem...
Linux distributions often include package management tools such as apt-get in Debian or yum RedHat. Using information about dependencies and conflicts, can determine how to install a new (and its dependencies) on system of already installed packages. off-the-shelf SAT solvers, pseudo-boolean Integer Linear Programming we have developed package-management tool, called Opium, that improves current two ways: (1) Opium is complete, if there solution, guaranteed find it, (2) optimize...
Integrated Development Environments (IDEs) have come to perform a wide variety of tasks on behalf the programmer, refactoring being classic example. These operations undeniable benefits, yet their large (and growing) number poses cognitive scalability problem. Our main contribution is WitchDoctor -- system that can detect, fly, when programmer hand-coding refactoring. The then complete in background and propose it user long before it. This implies technical challenges. algorithm must be 1)...
Integrated Development Environments (IDEs) have come to perform a wide variety of tasks on behalf the programmer, refactoring being classic example. These operations undeniable benefits, yet their large (and growing) number poses cognitive scalability problem. Our main contribution is WitchDoctor - system that can detect, fly, when programmer hand-coding refactoring. The then complete in background and propose it user long before it. This implies technical challenges. algorithm must be 1)...
Floating-point arithmetic plays a central role in science, engineering, and finance by enabling developers to approximate real arithmetic. To address numerical issues large floating-point applications, must identify root causes, which is difficult because errors are generally non-local, non-compositional, non-uniform.
Dataflow analyses can have mutually beneficial interactions. Previous efforts to exploit these interactions either (1) iteratively performed each individual analysis until no further improvements are discovered or (2) developed "super-analyses" that manually combine conceptually separate analyses. We devised a new approach allows be defined independently while still enabling them combined automatically and profitably. Our avoids the loss of precision associated with iterating implementation...
We describe a technique for automatically proving compiler optimizations sound, meaning that their transformations are always semantics-preserving. first present domain-specific language, called Cobalt, implementing as guarded rewrite rules. Cobalt operate over C-like intermediate representation including unstructured control flow, pointers to local variables and dynamically allocated memory, recursive procedures. Then we the soundness of optimizations. Our requires an automatic theorem...
Dataflow analyses for concurrent programs differ from their single-threaded counterparts in that they must account shared memory locations being overwritten by threads. Existing dataflow analysis techniques typically fall at either end of a spectrum: one end, the conservatively kills facts about all data might possibly be multiple threads; other precise thread-interleaving determines which may shared, and thus invalidated. The former approach can suffer imprecision, whereas latter does not scale.
We present a novel technique for verifying properties of data parallel GPU programs via test amplification. The key insight behind our work is that we can use the static information flow to amplify result single execution over set all inputs and interleavings affect property being verified. empirically demonstrate effectiveness amplification race-freedom determinism large number standard kernels, by showing dynamic be amplified massive space possible thread interleavings.
Despite decades of research on parsing, the construction parsers remains a painstaking, manual process prone to subtle bugs and pitfalls. We present programming-by-example framework called Parsify that is able synthesize parser from input/output examples. The user does not write single line code. To achieve this, provides: (a) an iterative algorithm for synthesizing refining grammar one example at time, (b) interface provides immediate visual feedback in response changes being refined, (c)...
Live programming is a regime in which the environment provides continual feedback, most often form of runtime values. In this paper, we present Projection Boxes, novel visualization technique for displaying values programs. The key idea behind projection boxes to start with full semantics program, and then use projections pick subset display. By varying used, can encode both previously known techniques, also new ones. As such, provide an expressive configurable framework information. Through...
Live programming is a paradigm in which the environment continually displays runtime values. Program synthesis technique that can generate programs or program snippets from examples. \deltextThis paper presents new called Synthesis-Aided Programming combines these two prior ideas synergistic way. When using Programming, programmers change values displayed by live \addtextPrevious works combine have taken holistic approach to way examples describe behavior of functions and programs. This...