Marcel Böhme

ORCID: 0000-0002-4470-1824
Research Areas
  • Software Testing and Debugging Techniques
  • Software Engineering Research
  • Software Reliability and Analysis Research
  • Advanced Malware Detection Techniques
  • Security and Verification in Computing
  • Software System Performance and Reliability
  • Manufacturing Process and Optimization
  • Adversarial Robustness in Machine Learning
  • Formal Methods in Verification
  • Web Application Security Vulnerabilities
  • Advanced Software Engineering Methodologies
  • Machine Learning and Algorithms
  • Information and Cyber Security
  • Logic, Programming, and Type Systems
  • Cryptographic Implementations and Security
  • Regional Development and Environment
  • Medical Research and Treatments
  • Evolution and Genetic Dynamics
  • Peer-to-Peer Network Technologies
  • Music and Audio Processing
  • Species Distribution and Climate Change
  • Machine Learning and Data Classification
  • Real-Time Systems Scheduling
  • Risk and Safety Analysis
  • Teaching and Learning Programming

Max Planck Institute for Security and Privacy
2021-2025

Monash University
2018-2024

Association for Computing Machinery
2019-2021

Australian Regenerative Medicine Institute
2019-2020

National University of Singapore
2012-2018

Saarland University
2014-2015

Software (Germany)
2015

TU Dresden
2009

Existing Greybox Fuzzers (GF) cannot be effectively directed, for instance, towards problematic changes or patches, towards critical system calls or dangerous locations, or towards functions in the stack-trace of a reported vulnerability that we wish to reproduce. In this paper, we introduce Directed Greybox Fuzzing (DGF), which generates inputs with the objective of efficiently reaching a given set of target program locations. We develop and evaluate a simulated annealing-based power schedule that gradually assigns more energy to seeds that are...

10.1145/3133956.3134020 article EN Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security 2017-10-27
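The annealing-based power schedule described in the abstract above can be sketched as follows. The cooling constant, parameter names, and energy scale are illustrative assumptions, not AFLGo's exact implementation:

```python
def annealing_energy(norm_distance, t_elapsed, t_exploitation):
    """Assign fuzzing energy to a seed based on its normalized distance
    to the target locations (0 = at a target, 1 = farthest away).

    Early in the campaign (high temperature) all seeds get similar
    energy (exploration); as the temperature cools, seeds closer to
    the targets receive progressively more energy (exploitation).
    Constants are illustrative, not AFLGo's exact values.
    """
    # Exponential cooling: temperature falls from 1 towards 0 once the
    # elapsed time exceeds the time-to-exploitation t_exploitation.
    temperature = 20 ** (-t_elapsed / t_exploitation)
    # Weighted blend: at T=1 every seed gets 0.5; at T=0 the energy is
    # fully determined by proximity to the targets.
    return (1.0 - norm_distance) * (1.0 - temperature) + 0.5 * temperature

# A close seed (d=0.1) and a far seed (d=0.9): at t=0 both get 0.5;
# late in the campaign the close seed receives far more energy.
early_close = annealing_energy(0.1, t_elapsed=0, t_exploitation=10)    # 0.5
late_close  = annealing_energy(0.1, t_elapsed=100, t_exploitation=10)
late_far    = annealing_energy(0.9, t_elapsed=100, t_exploitation=10)
```

The blend means direction information is phased in gradually, so the fuzzer does not starve far-away seeds before it has explored broadly.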

Coverage-based Greybox Fuzzing (CGF) is a random testing approach that requires no program analysis. A new test is generated by slightly mutating a seed input. If the test exercises a new and interesting path, it is added to the set of seeds; otherwise, it is discarded. We observe that most tests exercise the same few "high-frequency" paths and develop strategies to explore significantly more paths with the same number of tests by gravitating towards low-frequency paths. We explain the challenges and opportunities of CGF using a Markov chain model which specifies the probability...

10.1145/2976749.2978428 article EN Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security 2016-10-24
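A power schedule that gravitates towards low-frequency paths, as described above, can be sketched like this. The class name, constants, and exact formula are hypothetical simplifications of the paper's schedules:

```python
from collections import Counter

class LowFrequencyScheduler:
    """Toy power schedule: assign more energy to seeds exercising
    low-frequency paths. Constants and the growth formula are
    illustrative; the real schedules cap energy at a bound M."""

    def __init__(self, base=8, cap=1024):
        self.base, self.cap = base, cap
        self.path_freq = Counter()     # f(p): executions exercising path p
        self.times_chosen = Counter()  # s(i): times seed i was picked

    def record_execution(self, path_id):
        self.path_freq[path_id] += 1

    def energy(self, seed_id, path_id):
        """Energy grows with how often the seed was chosen, but is
        divided by how frequently its path is exercised overall, so
        rare paths are fuzzed much more intensively."""
        self.times_chosen[seed_id] += 1
        s = self.times_chosen[seed_id]
        f = self.path_freq[path_id] or 1
        return min(self.base * 2 ** s // f, self.cap)
```

Under this model a seed on a path exercised 100 times receives a fraction of the energy given to a seed on a path exercised once, which is exactly the "gravitation" the Markov chain view motivates.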

Coverage-based Greybox Fuzzing (CGF) is a random testing approach that requires no program analysis. A new test is generated by slightly mutating a seed input. If the test exercises a new and interesting path, it is added to the set of seeds; otherwise, it is discarded. We observe that most tests exercise the same few "high-frequency" paths and develop strategies to explore significantly more paths with the same number of tests by gravitating towards low-frequency paths. We explain the challenges and opportunities of CGF using a Markov chain model which specifies the probability...

10.1109/tse.2017.2785841 article EN IEEE Transactions on Software Engineering 2017-12-21

Server fuzzing is difficult. Unlike simple command-line tools, servers feature a massive state space that can be traversed effectively only with well-defined sequences of input messages. Valid sequences are specified as part of a protocol. In this paper, we present AFLNET, the first greybox fuzzer for protocol implementations. Unlike existing protocol fuzzers, AFLNET takes a mutational approach and uses state-feedback to guide the fuzzing process. AFLNET is seeded with a corpus of recorded message exchanges between the server and an actual client. No protocol specification or...

10.1109/icst46399.2020.00062 article EN 2020-08-05
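The state-feedback idea above can be illustrated with a minimal seed selector: server response codes stand in for protocol states, and message sequences that reached rarely-seen states are mutated more often. Class and method names are hypothetical; this is not AFLNet's implementation:

```python
import random
from collections import Counter

class StateGuidedSelector:
    """Toy state-feedback selection: the server's response codes
    (e.g. FTP '220', '500') are treated as protocol states, and
    recorded message exchanges that reached rare states are chosen
    more often for further mutation. Illustrative sketch only."""

    def __init__(self):
        self.state_hits = Counter()
        self.corpus = []  # (message_sequence, states_reached)

    def add_exchange(self, messages, response_codes):
        for code in response_codes:
            self.state_hits[code] += 1
        self.corpus.append((messages, set(response_codes)))

    def pick_seed(self, rng=random):
        # Weight each recorded sequence by the rarity of the states
        # it reaches: rare states contribute larger 1/hits terms.
        def rarity(states):
            return sum(1.0 / self.state_hits[s] for s in states)
        weights = [rarity(states) for _, states in self.corpus]
        return rng.choices(self.corpus, weights=weights, k=1)[0][0]
```

Seeding the corpus from recorded client-server exchanges, as the abstract describes, means no protocol specification is needed to start fuzzing.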

How do we find security flaws in a protocol implementation without a machine-readable specification of the protocol? Facing the internet, protocol implementations are particularly security-critical software systems, where inputs must adhere to a specific structure and order that is often only informally specified in hundreds of pages of natural language (RFC). Without some machine-readable version of the protocol, it is difficult to automatically generate valid test inputs for its implementation that follow the required structure and order. It is possible to partially alleviate this challenge using mutational...

10.14722/ndss.2024.24556 article EN 2024-01-01

Research has produced many approaches to automatically locate, explain, and repair software bugs. But do these approaches relate to the way practitioners actually locate, understand, and fix bugs? To help answer this question, we have collected a dataset named DBGBENCH --- the correct fault locations, bug diagnoses, and patches of 27 real errors in open-source C projects that were consolidated from hundreds of debugging sessions of professional software engineers. Moreover, we shed light on the entire debugging process, from constructing a hypothesis to submitting...

10.1145/3106237.3106255 article EN 2017-08-02

Many real-world programs take highly structured and very complex inputs. The automated testing of such programs is non-trivial. If the test input does not adhere to a specific file format, the program returns a parser error. For symbolic execution-based whitebox fuzzing, the corresponding error-handling code becomes a significant time sink. Too much time is spent in the parser exploring too many paths leading to trivial parser errors. Naturally, the time is better spent in the functional part of the program, where a failure with a valid input exposes deep and real bugs in the program. In this paper, we...

10.1145/2970276.2970316 article EN 2016-08-25

Android testing tools generate sequences of input events to exercise the state space of the app-under-test. Existing search-based techniques systematically evolve a population of event sequences so as to achieve certain objectives, such as maximal code coverage. The hope is that the mutation of fit event sequences leads to the generation of even fitter sequences. However, the evolution of event sequences may be ineffective. Our key insight is that pertinent app states which contributed to the original sequence's fitness may not be reached by a mutated sequence, whose path through the app is truncated at the point of mutation.

10.1145/3377811.3380402 article EN 2020-06-27

In this paper, we take the fundamental perspective of fuzzing as a learning process. Suppose that before fuzzing, we know nothing about the behaviors of the program P: What does it do? Executing the first test input, we learn how P behaves for this input. Executing the next input, we either observe the same or discover a new behavior. As such, each execution reveals "some amount" of information about P's behaviors. A classic measure of information is Shannon's entropy. Measuring entropy allows us to quantify how much is learned from each generated test input about the behaviors of the program. Within a probabilistic...

10.1145/3368089.3409748 article EN 2020-11-08
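The entropy measure named in the abstract above can be computed directly over the empirical distribution of observed behaviors. A minimal sketch, using path identifiers as the unit of behavior (the paper works with improved estimators, not this naive plug-in):

```python
import math
from collections import Counter

def behavior_entropy(observed_paths):
    """Shannon entropy (in bits) of the empirical distribution over
    program behaviors (here: path identifiers) observed so far.
    A fuzzer that keeps hitting the same path learns nothing new
    (entropy 0); a spread over many distinct paths yields a higher
    estimate. Naive plug-in estimator for illustration only."""
    counts = Counter(observed_paths)
    n = len(observed_paths)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Uniform observations over 4 paths carry 2 bits of entropy; a single
# repeated path carries 0 bits.
```

In this learning view, a seed is more promising the more information its mutants are expected to reveal, which is what an entropy-based schedule rewards.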

Given a program where none of our fuzzers finds any bugs, how do we know which fuzzer is better? In practice, we often look to code coverage as a proxy measure of fuzzer effectiveness and consider the fuzzer which achieves more coverage as the better one.

10.1145/3510003.3510230 article EN Proceedings of the 44th International Conference on Software Engineering 2022-05-21

Intuitively, we know that some software errors are more complex than others. If the error can be fixed by changing one faulty statement, it is a simple error. The more substantial the fix must be, the more complex we consider the error.

10.1145/2610384.2628058 article EN 2014-07-11

We present counterintuitive results for the scalability of fuzzing. Given the same non-deterministic fuzzer, finding the same bugs linearly faster requires linearly more machines. For instance, with twice the machines, we can find all known bugs in half the time. Yet, finding linearly more bugs in the same time requires exponentially more machines. For instance, for every new bug we want to find within 24 hours, we might need twice as many machines. Similarly for coverage. With exponentially more machines, we can cover the same code exponentially faster, but uncovered code only linearly faster. In other words, re-discovering the same vulnerabilities is cheap, but finding new vulnerabilities is expensive. This holds even under the simplifying assumption of no...

10.1145/3368089.3409729 article EN 2020-11-08
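The arithmetic behind the exponential-cost claim above can be made concrete in a few lines. The doubling factor is an illustrative reading of "for every new bug, twice as many machines", not a constant from the paper:

```python
def machines_for_more_bugs(n_new_bugs, factor=2):
    """Toy model of the empirical law in the abstract: each ADDITIONAL
    bug found within the same fixed time budget requires (roughly) a
    constant factor more machines -- i.e. exponentially many overall.
    The factor of 2 is an illustrative assumption."""
    return factor ** n_new_bugs

def machines_for_speedup(speedup):
    """By contrast, re-finding the SAME bugs faster scales linearly:
    twice the machines halves the time."""
    return speedup

# Same bugs in half the time: 2 machines' worth of work.
# Three *more* bugs in the same 24 hours: 2**3 = 8x the machines.
```

The contrast between the two functions is the point: speeding up rediscovery is linear, while extending the frontier of discovery is exponential.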

What you change is what you fuzz! In an empirical study of all fuzzer-generated bug reports in OSS-Fuzz, we found that four in every five bugs have been introduced by recent code changes. That is, 77% of 23k bugs are regressions. For a newly added project, there is usually an initial burst of new reports at 2-3 bugs per day. However, after that burst, and after weeding out most of the existing bugs, we still get a constant rate of 3-4 bug reports per week. The constant rate can only be explained by an increasing regression rate. Indeed, the probability that a reported bug is a regression (i.e., we could identify...

10.1145/3460120.3484596 article EN Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security 2021-11-12

We study the relative efficiencies of the random and systematic approaches to automated software testing. Using a simple but realistic set of assumptions, we propose a general model for software testing and define sampling strategies for random (R) and systematic (S0) testing, where each sampling is associated with a cost: 1 and c units of time, respectively. The two most important goals of software testing are: (i) achieving in minimal time a given degree of confidence x...

10.1109/tse.2015.2487274 article EN IEEE Transactions on Software Engineering 2015-10-05

A fundamental challenge of software testing is the statistically well-grounded extrapolation from program behaviors observed during testing. For instance, a security researcher who has run a fuzzer for a week currently has no means (1) to estimate the total number of feasible branches, given that only a fraction has been covered so far; (2) to estimate the additional time required to cover 10% more branches (or the coverage achieved in one more day, respectively); or (3) to assess the residual risk that a vulnerability exists when none has been discovered. Failing...

10.1145/3210309 article EN ACM Transactions on Software Engineering and Methodology 2018-04-30
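The residual-risk question in the abstract above maps onto a classic missing-mass estimator from biostatistics. A minimal sketch of the Good-Turing estimate, one of the estimators that this line of work adapts (the paper develops and evaluates several; this is only the simplest):

```python
from collections import Counter

def residual_risk_estimate(species_counts):
    """Good-Turing estimate of the probability that the NEXT generated
    input exercises a yet-unseen behavior ('species'): f1 / n, where
    f1 is the number of behaviors observed exactly once and n is the
    total number of test executions. Sketch for illustration; used
    here as a proxy for the residual risk of a missed vulnerability."""
    n = sum(species_counts.values())
    f1 = sum(1 for c in species_counts.values() if c == 1)
    return f1 / n if n else 1.0

# After 10 executions where two behaviors were each seen only once,
# the estimated chance the next input reveals something new is 0.2.
```

Intuitively, many singleton behaviors mean the campaign is still discovering rapidly, so stopping now carries a high residual risk.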

While fuzzing can be very costly, it has proven to be a fundamental technique in uncovering bugs (often security-related) in many applications. A recent study on bug reports from OSS-Fuzz observed that recent code changes are responsible for 77% of all reported bugs, stressing the importance of continuous testing. With the increased adoption of CI/CD practices in software development, it is only natural to look for effective ways of incorporating fuzzing into CI/CD pipelines. In this paper, we evaluate the effectiveness of fuzz testing in CI/CD pipelines, with a focus on bugs related to recent changes and...

10.1109/sbft59156.2023.00015 article EN 2023-05-01

Large Language Models (LLMs) have shown tremendous promise in automated software engineering. In this paper, we investigate the opportunities of LLMs for automatic regression test generation for programs that take highly structured, human-readable inputs, such as XML parsers or JavaScript interpreters. Concretely, we explore the following scenarios that have so far been difficult to test automatically in the absence of corresponding input grammars: $\bullet$ Bug finding. Given a code change (e.g., a commit or pull request), our...

10.48550/arxiv.2501.11086 preprint EN arXiv (Cornell University) 2025-01-19

10.1109/tse.2025.3535925 article EN cc-by IEEE Transactions on Software Engineering 2025-01-01

Changes often introduce program errors, and hence the recent software testing literature has focused on generating tests which stress changes. In this paper, we argue that changes cannot be treated as isolated artifacts that are stressed via testing. Instead, it is the complex dependency across multiple changes which introduces subtle errors. Furthermore, such dependence structures need to be exercised to expose these errors and ensure they do not remain undiscovered even in well-tested and deployed software. We motivate our work based on empirical evidence...

10.1145/2491411.2491430 article EN 2013-08-18

Detecting regression bugs in software evolution, analyzing side-channels in programs, and evaluating the robustness of deep neural networks (DNNs) can all be seen as instances of differential software analysis, where the goal is to generate diverging executions of program paths. Two executions are said to be diverging if their observable behavior differs, e.g., in terms of output, execution time, or (DNN) classification. The key challenge of differential analysis is to simultaneously reason about multiple program paths, often across program variants.

10.1145/3377811.3380363 article EN 2020-06-27
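The notion of a diverging execution defined above can be captured in a few lines. This is a deliberately minimal oracle (return value or exception type only); real differential analyses also compare execution time, coverage, or DNN labels:

```python
def diverges(prog_v1, prog_v2, test_input):
    """Run the same input on two program variants and flag a
    difference in observable behavior -- here, either the return
    value or the type of a raised exception. Minimal illustrative
    divergence oracle, not a full differential analysis."""
    def observe(prog):
        try:
            return ("ok", prog(test_input))
        except Exception as e:
            return ("exn", type(e).__name__)
    return observe(prog_v1) != observe(prog_v2)

# A regression introduced by a change: the new version divides by
# (x - 4) and crashes where the old version returned normally.
old = lambda x: x // 2
new = lambda x: x // (x - 4)
```

Given such an oracle, the search problem is to find inputs for which the two variants diverge, which is what a differential fuzzer automates.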

Many protocol implementations are reactive systems, where the protocol process is in continuous interaction with other processes and the environment. If a bug can be exposed only in a certain state, a fuzzer needs to provide a specific sequence of events as inputs that would take the protocol into this state before the bug is manifested. We call these bugs "stateful" bugs. Usually, when we are testing a protocol implementation, we do not have a detailed formal specification of the protocol to rely upon. Without knowledge of the protocol, it is inherently difficult for a fuzzer to discover such...

10.48550/arxiv.2204.02545 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Regression verification (RV) seeks to guarantee the absence of regression errors in a changed program version. This paper presents Partition-based Regression Verification (PRV): an approach to RV based on the gradual exploration of differential input partitions. A differential input partition is a subset of the common input space of two program versions that serves as a unit of verification. Instead of proving the absence of regression for the complete input space at once, PRV verifies partitions in a gradual manner. If the exploration is interrupted, PRV retains partial guarantees at least for the explored partitions, which is crucial in practice as verifying the complete input space can be...

10.5555/2486788.2486829 article EN International Conference on Software Engineering 2013-05-18

Regression verification (RV) seeks to guarantee the absence of regression errors in a changed program version. This paper presents Partition-based Regression Verification (PRV): an approach to RV based on the gradual exploration of differential input partitions. A differential input partition is a subset of the common input space of two program versions that serves as a unit of verification. Instead of proving the absence of regression for the complete input space at once, PRV verifies partitions in a gradual manner. If the exploration is interrupted, PRV retains partial guarantees at least for the explored partitions, which is crucial in practice as verifying the complete input space can be...

10.1109/icse.2013.6606576 article EN 2013 35th International Conference on Software Engineering (ICSE) 2013-05-01
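The partition-as-unit-of-verification idea described above can be illustrated with a tiny concrete check. Here a partition is a finite set of inputs checked by execution; PRV itself reasons symbolically over (possibly infinite) partitions, so this is only a sketch of the gradual, interruptible structure:

```python
def verify_partition(v1, v2, partition):
    """Check one differential input partition: both program versions
    must produce the same output for every input in the partition.
    Returns (verified_inputs, counterexample). Because inputs are
    checked one at a time, interrupting the loop still leaves a
    partial guarantee for everything verified so far -- the property
    the abstract highlights. Illustrative sketch, not PRV itself."""
    verified = []
    for x in partition:
        if v1(x) != v2(x):
            return verified, x  # counterexample: a regression error
        verified.append(x)
    return verified, None

# Two behaviorally equivalent versions of absolute value:
v1 = lambda x: abs(x)
v2 = lambda x: x if x >= 0 else -x
```

Partitioning the common input space this way lets verification effort accumulate: each verified partition is a guarantee that survives even if the overall run never completes.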