- Software Engineering Research
- Software Testing and Debugging Techniques
- Software Reliability and Analysis Research
- Advanced Malware Detection Techniques
- Software System Performance and Reliability
- Scientific Computing and Data Management
- Network Security and Intrusion Detection
- Software Engineering Techniques and Practices
- Open Source Software Innovations
- Topic Modeling
- Logic, programming, and type systems
- Natural Language Processing Techniques
- Parallel Computing and Optimization Techniques
- Machine Learning and Data Classification
- Security and Verification in Computing
- Advanced Software Engineering Methodologies
- Anomaly Detection Techniques and Applications
- Web Application Security Vulnerabilities
- Numerical Methods and Algorithms
- Evolutionary Algorithms and Applications
- Cloud Computing and Resource Management
- Semantic Web and Ontologies
- Machine Learning and Algorithms
- Peer-to-Peer Network Technologies
- Viral Infectious Diseases and Gene Expression in Insects
University College London
2015-2024
Google (United Kingdom)
2023-2024
Google (Canada)
2024
DeepMind (United Kingdom)
2023
University of California, Davis
2002-2014
University of California, San Diego
2009
University of California System
2002
Testing involves examining the behaviour of a system in order to discover potential faults. Given an input for a system, the challenge of distinguishing the corresponding desired, correct behaviour from potentially incorrect behaviour is called the "test oracle problem". Test oracle automation is important to remove a current bottleneck that inhibits greater overall test automation. Without test oracle automation, the human has to determine whether observed behaviour is correct. The literature on test oracles has introduced techniques for oracle automation, including modelling, specifications,...
Natural languages like English are rich, complex, and powerful. The highly creative and graceful use of languages like English and Tamil, by masters like Shakespeare and Avvaiyar, can certainly delight and inspire. But in practice, given cognitive constraints and the exigencies of daily life, most human utterances are far simpler and much more repetitive and predictable. In fact, these utterances can be very usefully modeled using modern statistical methods. This fact has led to the phenomenal success of statistical approaches to speech recognition, natural language translation,...
Descriptive names are a vital part of readable, and hence maintainable, code. Recent progress on automatically suggesting names for local variables tantalizes with the prospect of replicating that success with method and class names. However, suggesting names for methods and classes is much more difficult. This is because good method and class names need to be functionally descriptive, but suggesting such names requires that the model goes beyond local context. We introduce a neural probabilistic language model for source code that is specifically designed for the method naming problem. Our model learns which names are semantically similar...
Every programmer has a characteristic style, ranging from preferences about identifier naming to preferences about object relationships and design patterns. Coding conventions define a consistent syntactic style, fostering readability and hence maintainability. When collaborating, programmers strive to obey a project's coding conventions. However, one third of reviews of changes contain feedback about coding conventions, indicating that programmers do not always follow them and that project members care deeply about adherence. Unfortunately, programmers are often unaware of coding conventions because...
We are now witnessing the rapid growth of decentralized source code management (DSCM) systems, in which every developer has her own repository. DSCMs facilitate a style of collaboration in which work output can flow sideways (and privately) between collaborators, rather than always up and down (and publicly) via a central repository. Decentralization comes with both the promise of new data and the peril of its misinterpretation. We focus on git, a very popular DSCM used in high-profile projects. Decentralization, and other features of git, such as...
Automated program repair has shown promise for reducing the significant manual effort debugging requires. This paper addresses a deficit of earlier evaluations of automated repair techniques caused by repairing programs and evaluating generated patches' correctness using the same set of tests. Since tests are an imperfect metric of program correctness, evaluations of this type do not discriminate between correct patches and patches that overfit the available tests and break untested but desired functionality. This paper evaluates two well-studied repair tools, GenProg...
Recent work on genetic-programming-based approaches to automatic program patching has relied on the insight that the content of new code can often be assembled out of fragments of code that already exist in the code base. This insight has been dubbed the plastic surgery hypothesis; successful, well-known automatic repair tools such as GenProg rest on this hypothesis, but it has never been validated. We formalize and validate the plastic surgery hypothesis and empirically measure the extent to which raw material for changes actually already exists in projects. In this paper, we mount a large-scale...
Dynamically typed languages such as JavaScript and Python are increasingly popular, yet static typing has not been totally eclipsed: Python now supports type annotations and languages like TypeScript offer a middle-ground for JavaScript: a strict superset of JavaScript, to which it transpiles, coupled with a type system that permits partially typed programs. However, static typing has a cost: adding annotations, reading the added syntax, and wrestling with the type system to fix type errors. Type inference can ease the transition to more statically typed code and unlock the benefits of richer...
Large Language Models (LLMs) are a new class of computation engines, "programmed" via prompt engineering. Researchers are still learning how to best "program" these LLMs to help developers. We start with the intuition that developers tend to consciously and unconsciously collect semantic facts from the code while working. Mostly these are shallow, simple facts arising from a quick read. For a function, such facts might include parameter and local variable names, return expressions, pre- and post-conditions, and basic control and data flow, etc.
Automated transplantation would open many exciting avenues for software development: suppose we could autotransplant code from one system into another, entirely unrelated, system. This paper introduces a theory, an algorithm, and a tool that achieve this. Leveraging lightweight annotation, program analysis identifies an organ (interesting behavior to transplant); testing validates that the organ exhibits the desired behavior during its extraction and after its implantation into a host. While we do not claim automated transplantation is now a solved...
The all-important goal of delivering better software at lower cost has led to a vital, enduring quest for ways to find and remove defects efficiently and accurately. To this end, two parallel lines of research have emerged over the last years. Static analysis seeks to find defects using algorithms that process well-defined semantic abstractions of code. Statistical defect prediction uses historical data to estimate parameters of statistical formulae modeling the phenomena thought to govern defect occurrence and predict where defects are likely to occur....
Type inference over partial contexts in dynamically typed languages is challenging. In this work, we present a graph neural network model that predicts types by probabilistically reasoning over a program's structure, names, and patterns. The network uses deep similarity learning to learn a TypeSpace, a continuous relaxation of the discrete space of types, and how to embed the type properties of a symbol (i.e. identifier) into it. Importantly, our model can employ one-shot learning to predict an open vocabulary of types, including rare and user-defined ones....
What is a good workday for a software developer? What is a typical workday? We seek to answer these two questions to learn how to make good days typical. Concretely, answering these questions will help to optimize development processes and select tools that increase job satisfaction and productivity. Our work adds to a large body of research on how software developers spend their time. We report the results from 5,971 responses of professional developers at Microsoft, who reflected about what made their workdays good and typical, and self-reported about how they spent their time on various activities at work....
ConceptDoppler: a weather tracker for internet censorship. CCS '07: Proceedings of the 14th ACM Conference on Computer and Communications Security, October 2007, pages 352–365. https://doi.org/10.1145/1315245.1315290
Inspection is a highly effective but costly technique for quality control. Most companies do not have the resources to inspect all the code; thus accurate defect prediction can help focus available inspection resources. BugCache is a simple, elegant, award-winning prediction scheme that "caches" files that are likely to contain defects [12]. In this paper, we evaluate the utility of BugCache as a tool for focusing inspection, examine the assumptions underlying BugCache with the aim of improving it, and finally compare it with a standard bug-prediction technique....
It is well-known that floating-point exceptions can be disastrous and that writing exception-free numerical programs is very difficult. Thus, it is important to automatically detect such errors. In this paper, we present Ariadne, a practical symbolic execution system specifically designed and implemented for detecting floating-point exceptions. Ariadne systematically transforms a numerical program to explicitly check each exception triggering condition. Ariadne symbolically executes the transformed program using real arithmetic to find candidate...
Uncertainty complicates early requirements and architecture decisions and may expose a software project to significant risk. Yet software architects lack support for evaluating uncertainty, its impact on risk, and the value of reducing uncertainty before making critical decisions. We propose to apply decision analysis and multi-objective optimisation techniques to provide such support. We present a systematic method allowing software architects to describe uncertainty about the impact of alternatives on stakeholders' goals; to calculate the consequences of uncertainty through Monte-Carlo...
JavaScript is growing explosively and is now used in large mature projects even outside the web domain. JavaScript is also a dynamically typed language for which static type systems, notably Facebook's Flow and Microsoft's TypeScript, have been written. What benefits do these static type systems provide? Leveraging JavaScript project histories, we select a fixed bug and check out the code just prior to the fix. We manually add type annotations to the buggy code and test whether Flow and TypeScript report an error on the buggy code, thereby possibly prompting a developer to fix the bug before its...
To enhance developer productivity, all modern integrated development environments (IDEs) include code suggestion functionality that proposes likely next tokens at the cursor. While current IDEs work well for statically-typed languages, their reliance on type annotations means that they do not provide the same level of support for dynamic programming languages as for statically-typed languages. Moreover, suggestion engines in modern IDEs do not propose expressions or multi-statement idiomatic code. Recent work has shown that language models can improve code suggestion systems by...
Software has bugs, and fixing those bugs pervades the software engineering process. It is folklore that bug fixes are often buggy themselves, resulting in bad fixes, either failing to fix a bug or creating new bugs. To confirm this folklore, we explored bug databases of the Ant, AspectJ, and Rhino projects, and found that bad fixes comprise as much as 9% of all bugs. Thus, detecting and correcting bad fixes is important for improving the quality and reliability of software. However, no prior work has systematically considered this bad fix problem, which this paper introduces and formalizes....