Andreas Zeller

ORCID: 0000-0003-4719-8803
About
Research Areas
  • Software Engineering Research
  • Software Testing and Debugging Techniques
  • Software Reliability and Analysis Research
  • Advanced Malware Detection Techniques
  • Software System Performance and Reliability
  • Advanced Software Engineering Methodologies
  • Parallel Computing and Optimization Techniques
  • Scientific Computing and Data Management
  • Software Engineering Techniques and Practices
  • Security and Verification in Computing
  • Web Data Mining and Analysis
  • Formal Methods in Verification
  • Flexible and Reconfigurable Manufacturing Systems
  • Service-Oriented Architecture and Web Services
  • Natural Language Processing Techniques
  • Network Security and Intrusion Detection
  • Logic, programming, and type systems
  • Model-Driven Software Engineering Techniques
  • Business Process Modeling and Analysis
  • Advanced Data Storage Technologies
  • Topic Modeling
  • Data Mining Algorithms and Applications
  • Speech and dialogue systems
  • Web Application Security Vulnerabilities
  • Distributed systems and fault tolerance

Helmholtz Center for Information Security
2016-2025

Saarland University
2012-2022

University of Stuttgart
2015-2019

Institute of Automation
2017-2019

Software (Germany)
2002-2018

Institut für Automatisierung und Informatik
2018

Verband Deutscher Maschinen- und Anlagenbau
2018

Japan Science and Technology Agency
2018

Klinikum Saarbrücken
2016

Laboratoire Plasma et Conversion d'Energie
2010

Given some test case, a program fails. Which circumstances of the test case are responsible for the particular failure? The delta debugging algorithm generalizes and simplifies a failing test case to a minimal test case that still produces the failure. It also isolates the difference between a passing and a failing test case. In a case study, the Mozilla Web browser crashed after 95 user actions. Our prototype implementation automatically simplified the input to three relevant user actions. Likewise, it simplified 896 lines of HTML to the single line that caused the failure. The case study required 139 automated test runs, or 35 minutes...

10.1109/32.988498 article EN IEEE Transactions on Software Engineering 2002-01-01
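
The input simplification described above is an instance of the ddmin delta debugging algorithm. Below is a minimal Python sketch of ddmin, not the authors' implementation; the `test` oracle and the HTML snippet are made-up stand-ins for a real failure-reproducing setup.

```python
def ddmin(inp, test):
    """Reduce `inp` (a list) to a 1-minimal input on which `test` still returns "FAIL"."""
    assert test(inp) == "FAIL"
    n = 2                                        # granularity: number of chunks
    while len(inp) >= 2:
        k, m = divmod(len(inp), n)
        chunks = [inp[i * k + min(i, m):(i + 1) * k + min(i + 1, m)] for i in range(n)]
        for i in range(n):
            complement = [x for j, c in enumerate(chunks) if j != i for x in c]
            if test(chunks[i]) == "FAIL":        # one chunk alone reproduces the failure
                inp, n = chunks[i], 2
                break
            if test(complement) == "FAIL":       # the failure survives removing chunk i
                inp, n = complement, max(n - 1, 2)
                break
        else:                                    # no reduction found at this granularity
            if n >= len(inp):
                break                            # single-element chunks: result is 1-minimal
            n = min(n * 2, len(inp))
    return inp


def test(chars):
    """Hypothetical oracle: the browser 'crashes' whenever the input contains <SELECT>."""
    return "FAIL" if "<SELECT>" in "".join(chars) else "PASS"


print("".join(ddmin(list('<td bgcolor="#fff"><SELECT></td>'), test)))   # prints "<SELECT>"
```

The sketch repeatedly splits the input into chunks, keeps whatever chunk or complement still fails, and refines the granularity until removing any single piece makes the failure disappear.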

We apply data mining to version histories in order to guide programmers along related changes: "Programmers who changed these functions also changed...." Given a set of existing changes, the mined association rules 1) suggest and predict likely further changes, 2) show up item coupling that is undetectable by program analysis, and 3) can prevent errors due to incomplete changes. After an initial change, our ROSE prototype correctly predicts further locations to be changed; the best predictive power is obtained for changes to ... software. In...

10.1109/tse.2005.72 article EN IEEE Transactions on Software Engineering 2005-06-01
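
As a rough illustration of the approach (not the ROSE implementation), the sketch below counts pairwise co-changes in hypothetical commit transactions and ranks suggestions by rule confidence; the entity names and the confidence threshold are invented.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical commit transactions: each lists the entities changed together.
transactions = [
    {"fKeys[]", "initDefaults()", "plugin.properties"},
    {"fKeys[]", "initDefaults()"},
    {"fKeys[]", "initDefaults()", "Messages.java"},
    {"initDefaults()", "plugin.properties"},
]

pair_count = defaultdict(int)
item_count = defaultdict(int)
for t in transactions:
    for item in t:
        item_count[item] += 1
    for a, b in combinations(sorted(t), 2):
        pair_count[(a, b)] += 1
        pair_count[(b, a)] += 1

def suggest(changed, min_confidence=0.5):
    """Given an initial change, rank likely co-changes by rule confidence."""
    scores = {}
    for (a, b), support in pair_count.items():
        if a == changed and b != changed:
            confidence = support / item_count[a]
            if confidence >= min_confidence:
                scores[b] = confidence
    return sorted(scores.items(), key=lambda kv: -kv[1])

print(suggest("fKeys[]"))   # [('initDefaults()', 1.0)]
```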

As a software system evolves, programmers make changes that sometimes cause problems. We analyze CVS archives for fix-inducing changes---changes that lead to problems, indicated by fixes. We show how to automatically locate fix-inducing changes by linking a version archive (such as CVS) to a bug database (such as BUGZILLA). In a first investigation of the MOZILLA and ECLIPSE history, it turns out that fix-inducing changes show distinct patterns with respect to their size and the day of week on which they were applied.

10.1145/1082983.1083147 article EN ACM SIGSOFT Software Engineering Notes 2005-05-17
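
The linking of a version archive to a bug database typically matches bug identifiers and fix keywords in commit messages. A simplified sketch with invented commit messages and bug IDs (not the paper's exact heuristics):

```python
import re

# Hypothetical data: commit messages and the set of fixed-bug IDs from the bug database.
commits = {
    "r1201": "Fix crash on empty selection. Bug 34910.",
    "r1202": "Refactor layout code (no functional change).",
    "r1203": "#34987: NPE in table sorter",
}
fixed_bugs = {"34910", "34987"}

BUG_ID = re.compile(r"(?:bug\s*#?|#)(\d+)", re.IGNORECASE)
KEYWORDS = re.compile(r"\b(fix(e[ds])?|bugs?|defect)\b", re.IGNORECASE)

def is_fix(message):
    """A commit is a candidate fix if it mentions a known bug ID or a fix keyword."""
    ids = set(BUG_ID.findall(message))
    return bool(ids & fixed_bugs) or bool(KEYWORDS.search(message))

fixes = [rev for rev, msg in commits.items() if is_fix(msg)]
print(fixes)   # ['r1201', 'r1203'] -- earlier changes to the fixed lines become fix-inducing candidates
```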

What is it that makes software fail? In an empirical study of the post-release defect history of five Microsoft software systems, we found that failure-prone software entities are statistically correlated with code complexity measures. However, there is no single set of complexity metrics that could act as a universally best defect predictor. Using principal component analysis on the code metrics, we built regression models that accurately predict the likelihood of post-release defects for new entities. The approach can easily be generalized to arbitrary projects; in particular,...

10.1145/1134285.1134349 article EN Proceedings of the 28th International Conference on Software Engineering 2006-05-28
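
A hedged sketch of the metrics-to-defects idea: principal component analysis to decorrelate complexity metrics, followed by a regression model that estimates defect likelihood. The data here is synthetic and scikit-learn stands in for whatever tooling the study used.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)

# Synthetic stand-in for per-module complexity metrics (LOC, cyclomatic complexity, fan-in, ...).
n_modules, n_metrics = 200, 6
X = rng.normal(size=(n_modules, n_metrics))
# Synthetic "post-release defect" labels, loosely driven by the first two metrics.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n_modules) > 0.8).astype(int)

# PCA removes the strong inter-correlation between complexity metrics before regression.
model = make_pipeline(PCA(n_components=3), LogisticRegression())
model.fit(X[:150], y[:150])

probabilities = model.predict_proba(X[150:])[:, 1]   # defect likelihood for "new" entities
print(probabilities[:5].round(2))
```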

We have mapped defects from the bug database of Eclipse (one of the largest open-source projects) to source code locations. The resulting data set lists the number of pre- and post-release defects for every package and file in the Eclipse releases 2.0, 2.1, and 3.0. We additionally annotated the data with common complexity metrics. All data is publicly available and can serve as a benchmark for defect prediction models.

10.1109/promise.2007.10 article EN 2007-05-01

Which is the defect that causes a software failure? By comparing the program states of a failing and a passing run, we can identify state differences that cause the failure. However, these state differences can occur all over the run. Therefore, we focus in space on those variables and values that are relevant for the failure, and in time on those moments where cause transitions occur---moments where new relevant variables begin being failure causes: "Initially, variable argc was 3; therefore, at shell_sort(), a[2] was 0, and therefore, the program failed." In our evaluation, cause transitions locate the failure-inducing defect twice as well as the best methods...

10.1145/1062455.1062522 article EN 2005-01-01

Consider the execution of a failing program as a sequence of states. Each state induces the following state, up to the failure. Which variables and values are relevant for the failure? We show how the Delta Debugging algorithm isolates the relevant variables and values by systematically narrowing the state difference between a passing run and a failing run---by assessing the outcome of altered executions to determine whether a change in the program state makes a difference in the test outcome. Applying this to multiple states automatically reveals the cause-effect chain of the failure---that is, the variables and values that caused the failure. In a case study, our...

10.1145/587051.587053 article EN 2002-11-18
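
To illustrate the idea of narrowing state differences, the sketch below mixes failing-run values into a passing state and keeps only those differences needed to reproduce a hypothetical failure. The variable names, the test, and the greedy minimization are simplifications, not the paper's Delta Debugging on program states.

```python
# State of the same variables in a passing and a failing run (toy values).
passing_state = {"argc": 2, "size": 9, "a2": 7}
failing_state = {"argc": 3, "size": 9, "a2": 0}

def test(applied):
    """Run with passing values, except that variables in `applied` take their
    failing-run values. In this toy example, the failure hinges on a2 == 0."""
    state = dict(passing_state)
    state.update({var: failing_state[var] for var in applied})
    return "FAIL" if state["a2"] == 0 else "PASS"

diffs = [v for v in failing_state if failing_state[v] != passing_state[v]]

# Naive minimization: greedily drop differences whose removal preserves the failure.
relevant = list(diffs)
for var in diffs:
    trial = [v for v in relevant if v != var]
    if test(trial) == "FAIL":
        relevant = trial

print(relevant)   # ['a2'] -- the failure-relevant part of the state difference
```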

We apply data mining to version histories in order to guide programmers along related changes: "Programmers who changed these functions also changed...." Given a set of existing changes, such rules (a) suggest and predict likely further changes, (b) show up item coupling that is undetectable by program analysis, and (c) can prevent errors due to incomplete changes. After an initial change, our ROSE prototype can correctly predict 26% of the files to be changed - and 15% of the precise functions or variables. The topmost three suggestions contain a correct...

10.5555/998675.999460 article EN International Conference on Software Engineering 2004-05-23

We analyze the version history of 7 software systems to predict the most fault-prone entities and files. The basic assumption is that faults do not occur in isolation, but rather in bursts of several related faults. Therefore, we cache locations that are likely to have faults: starting from the location of a known (fixed) fault, we cache the location itself, any locations changed together with it, recently added locations, and recently changed locations. By consulting the cache at the moment a fault is fixed, a developer can detect likely fault-prone locations. This is useful for prioritizing verification and validation...

10.1109/icse.2007.66 article EN Proceedings - International Conference on Software Engineering 2007-05-01
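
The cache idea can be illustrated with a small LRU-style structure: whenever a fault is fixed, the fixed location and its co-changed locations enter the cache, and the current cache contents serve as the fault-proneness prediction. The cache size, file names, and fix events below are invented.

```python
from collections import OrderedDict

CACHE_SIZE = 4          # in the paper, the cache holds a fixed fraction of all files
cache = OrderedDict()   # most recently touched entries move to the end

def touch(*files):
    """Put files into the fault cache, evicting the least recently used entries."""
    for f in files:
        cache.pop(f, None)
        cache[f] = True
        while len(cache) > CACHE_SIZE:
            cache.popitem(last=False)

# Hypothetical history: each fix names the faulty file and the files co-changed with it.
fix_events = [
    {"fixed": "Parser.java", "co_changed": ["Lexer.java"]},
    {"fixed": "Render.java", "co_changed": ["Canvas.java", "Style.java"]},
    {"fixed": "Parser.java", "co_changed": ["Ast.java"]},
]

for event in fix_events:
    # At the moment a fault is fixed, cache the location itself and its co-changed locations.
    touch(event["fixed"], *event["co_changed"])

print(list(cache))   # files currently predicted to be fault prone
```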

How do we know a program does what it claims to do? After clustering Android apps by their description topics, we identify outliers in each cluster with respect to their API usage. A "weather" app that sends messages thus becomes an anomaly; likewise, a "messaging" app would typically not be expected to access the current location. Applied on a set of 22,500+ Android applications, our CHABADA prototype identified several anomalies; additionally, it flagged 56% of novel malware as such, without requiring any known malware patterns.

10.1145/2568225.2568276 article EN Proceedings of the 36th International Conference on Software Engineering 2014-05-20
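
A rough sketch of the cluster-then-flag-outliers idea, not the CHABADA pipeline: cluster apps by their description text, then flag APIs that are rare within an app's cluster. The app names, descriptions, APIs, and the rarity threshold are all invented; scikit-learn's KMeans stands in for the topic clustering.

```python
from collections import Counter
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical app descriptions and the sensitive APIs each app uses.
apps = {
    "SunnyCast":  ("local weather forecast and radar",       {"getLastKnownLocation"}),
    "RainAlert":  ("weather alerts and forecast",            {"getLastKnownLocation"}),
    "StormWatch": ("weather radar with storm warnings",      {"getLastKnownLocation", "sendTextMessage"}),
    "ChatNow":    ("chat and send messages to friends",      {"sendTextMessage"}),
    "QuickTalk":  ("send messages and chat with your contacts", {"sendTextMessage"}),
}

names = list(apps)
descriptions = [apps[n][0] for n in names]

# Cluster apps by description text (a crude stand-in for topic modeling).
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
    TfidfVectorizer().fit_transform(descriptions))

# Within each cluster, flag APIs used by only a small fraction of the cluster's apps.
for cluster in set(labels):
    members = [n for n, l in zip(names, labels) if l == cluster]
    api_freq = Counter(api for n in members for api in apps[n][1])
    for n in members:
        rare = {api for api in apps[n][1] if api_freq[api] / len(members) < 0.5}
        if rare:
            print(f"{n}: unusual APIs for its cluster: {rare}")
```

With this toy data, a weather app that also sends text messages should stand out from its cluster.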

Where do most vulnerabilities occur in software? Our Vulture tool automatically mines existing vulnerability databases and version archives to map past vulnerabilities to components. The resulting ranking of the most vulnerable components is a perfect base for further investigations on what makes components vulnerable.

10.1145/1315245.1315311 article EN 2007-10-28
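
The ranking itself boils down to counting how often each component was involved in past vulnerability fixes. A toy sketch with invented advisory IDs and file paths:

```python
from collections import Counter

# Hypothetical mapping from past vulnerability reports to the files they were fixed in.
vulnerability_fixes = {
    "ADVISORY-001": ["js/src/jsobj.c", "js/src/jsfun.c"],
    "ADVISORY-002": ["js/src/jsobj.c"],
    "ADVISORY-003": ["layout/base/nsCSSFrameConstructor.cpp"],
}

def component_of(path):
    """Map a file to its component; here simply the top-level source directory."""
    return path.split("/")[0]

counts = Counter(component_of(f) for files in vulnerability_fixes.values() for f in files)
for component, n in counts.most_common():
    print(f"{component}: {n} past vulnerability fixes")
# js: 3, layout: 1 -- the ranking points further investigation at the most vulnerable components
```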

Predicting the time and effort for a software problem has long been a difficult task. We present an approach that automatically predicts the fixing effort, i.e., the person-hours spent on an issue. Our technique leverages existing issue tracking systems: given a new issue report, we use the Lucene framework to search for similar, earlier reports and use their average effort as a prediction. Our approach thus allows for early effort estimation, helping in assigning issues and scheduling stable releases. We evaluated our approach using effort data from the JBoss project. Given sufficient...

10.1109/msr.2007.13 article EN 2007-05-01
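
A sketch of the nearest-neighbor effort prediction, using scikit-learn's TF-IDF and cosine similarity as a stand-in for the Lucene text search mentioned above; the issue titles and effort values are invented.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical earlier issue reports with the effort (person-hours) spent on them.
history = [
    ("NullPointerException when saving empty project", 4.0),
    ("Crash on startup with corrupted configuration file", 12.0),
    ("NullPointerException in project save dialog", 6.0),
    ("Typo in preferences dialog label", 0.5),
]

new_report = "NullPointerException when saving a project with no files"

vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform([title for title, _ in history] + [new_report])
similarities = cosine_similarity(matrix[-1], matrix[:-1]).ravel()

k = 2   # average the effort of the k most similar earlier reports
nearest = np.argsort(similarities)[::-1][:k]
estimate = np.mean([history[i][1] for i in nearest])
print(f"estimated effort: {estimate:.1f} person-hours")   # average of the two similar NPE reports
```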

As a software system evolves, programmers make changes that sometimes cause problems. We analyze CVS archives for fix-inducing changes---changes that lead to problems, indicated by fixes. We show how to automatically locate fix-inducing changes by linking a version archive (such as CVS) to a bug database (such as BUGZILLA). In a first investigation of the MOZILLA and ECLIPSE history, it turns out that fix-inducing changes show distinct patterns with respect to their size and the day of week on which they were applied.

10.1145/1083142.1083147 article EN 2005-01-01

To assess the quality of test suites, mutation analysis seeds artificial defects (mutations) into programs; a nondetected mutation indicates a weakness in the test suite. We present an automated approach to generate unit tests that detect these mutations for object-oriented classes. This has two advantages: First, the resulting test suite is optimized toward finding defects modeled by the mutation operators rather than merely covering code. Second, the state change caused by a mutation induces oracles that precisely detect the mutants. Evaluated on 10 open source libraries, our...

10.1109/tse.2011.93 article EN IEEE Transactions on Software Engineering 2011-09-22
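
Mutation analysis itself is easy to illustrate: seed an artificial defect into the subject and check whether the test suite notices. The subject function, the mutation, and both test suites below are invented; this shows only the assessment step, not the paper's test generation.

```python
import types

SOURCE = """
def discount(total):
    # subtract a 10-unit discount for orders of 100 or more
    if total >= 100:
        return total - 10
    return total
"""

def load(source, name="subject"):
    """Compile the source into a fresh module so original and mutant can coexist."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    return module

def weak_suite(m):
    return m.discount(50) == 50 and m.discount(200) == 190

def strong_suite(m):
    return weak_suite(m) and m.discount(100) == 90   # exercises the mutated boundary

original = load(SOURCE)
mutant = load(SOURCE.replace("total >= 100", "total > 100", 1))   # seeded boundary mutation

assert weak_suite(original) and strong_suite(original)
print("weak suite kills mutant:  ", not weak_suite(mutant))    # False -> undetected: weakness
print("strong suite kills mutant:", not strong_suite(mutant))  # True  -> mutation detected
```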

When interacting with version control systems, developers often commit unrelated or loosely related code changes in a single transaction. When analyzing the version history, such tangled changes will make all changes to all modules appear related, possibly compromising the resulting analyses through noise and bias. In an investigation of five open-source Java projects, we found up to 15% of all bug fixes to consist of multiple tangled changes. Using a multi-predictor approach to untangle changes, we show that on average at least 16.6% of all source files are...

10.1109/msr.2013.6624018 article EN 2013-05-01

In a manual examination of more than 7,000 issue reports from the bug databases of five open-source projects, we found 33.8% of all reports to be misclassified - that is, rather than referring to a code fix, they resulted in a new feature, an update to documentation, or an internal refactoring. This misclassification introduces bias in defect prediction models, confusing bugs and features: On average, 39% of files marked as defective actually never had a bug. We discuss the impact of this misclassification on earlier studies and recommend manual data validation for future studies.

10.5555/2486788.2486840 article EN International Conference on Software Engineering 2013-05-18

In a manual examination of more than 7,000 issue reports from the bug databases of five open-source projects, we found 33.8% of all reports to be misclassified - that is, rather than referring to a code fix, they resulted in a new feature, an update to documentation, or an internal refactoring. This misclassification introduces bias in defect prediction models, confusing bugs and features: On average, 39% of files marked as defective actually never had a bug. We discuss the impact of this misclassification on earlier studies and recommend manual data validation for future studies.

10.1109/icse.2013.6606585 article EN 2013 35th International Conference on Software Engineering (ICSE) 2013-05-01

What is it that makes an app malicious? One important factor is that malicious apps treat sensitive data differently from benign apps. To capture such differences, we mined 2,866 benign Android applications for their data flow from sensitive sources, and compare these flows against those found in malicious apps. We find that (a) for every sensitive source, the data ends up in a small number of typical sinks; (b) these sinks differ considerably between benign and malicious apps; (c) these differences can be used to flag malicious apps due to their abnormal data flow; and (d) malicious apps can be identified by their abnormal data flow alone, without requiring known malware...

10.1109/icse.2015.61 article EN 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering 2015-05-01
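
A toy sketch of flagging abnormal data flows: collect source-to-sink flows from benign apps, then flag flows that (almost) never occur in that corpus. The sources, sinks, and threshold are invented, and no real taint analysis is performed.

```python
from collections import Counter

# Hypothetical source->sink data flows extracted from benign apps.
benign_flows = [
    {("Location", "Internet")},
    {("Location", "Internet"), ("Contacts", "Internet")},
    {("Location", "Internet")},
    {("Contacts", "Internet")},
]

flow_freq = Counter(flow for app in benign_flows for flow in app)
n_benign = len(benign_flows)

def abnormal_flows(app_flows, threshold=0.1):
    """Flag flows that (almost) never occur in the benign corpus."""
    return {f for f in app_flows if flow_freq[f] / n_benign < threshold}

suspicious_app = {("Location", "SMS"), ("Location", "Internet")}
print(abnormal_flows(suspicious_app))   # {('Location', 'SMS')} -- sensitive data flowing to an unusual sink
```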

Imagine some program and a number of changes. If none of these changes is applied (“yesterday”), the program works. If all changes are applied (“today”), the program does not work. Which change is responsible for the failure? We present an efficient algorithm that determines the minimal set of failure-inducing changes. Our delta debugging prototype tracked down a single failure-inducing change from 178,000 changed GDB lines within a few hours.

10.1145/318774.318946 article EN ACM SIGSOFT Software Engineering Notes 1999-10-01
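
A simplified stand-in for delta debugging over changes (greedy minimization rather than the paper's dd algorithm): drop every change whose removal keeps "today's" build failing. The changes and the build-and-test function are hypothetical.

```python
changes = [f"change_{i}" for i in range(1, 9)]

def build_and_test(applied):
    """Pretend build: the failure shows up whenever change_3 and change_7 are both applied."""
    return "FAIL" if {"change_3", "change_7"} <= set(applied) else "PASS"

assert build_and_test([]) == "PASS"        # yesterday: no changes applied, works
assert build_and_test(changes) == "FAIL"   # today: all changes applied, fails

suspects = list(changes)
for change in changes:
    trial = [c for c in suspects if c != change]
    if build_and_test(trial) == "FAIL":    # still fails without this change: not needed
        suspects = trial

print(suspects)   # ['change_3', 'change_7'] -- the failure-inducing changes
```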

We apply data mining to version histories in order to guide programmers along related changes: "Programmers who changed these functions also changed...." Given a set of existing changes, such rules (a) suggest and predict likely further changes, (b) show up item coupling that is undetectable by program analysis, and (c) can prevent errors due to incomplete changes. After an initial change, our ROSE prototype can correctly predict 26% of the files to be changed - and 15% of the precise functions or variables. The topmost three suggestions contain a correct...

10.1109/icse.2004.1317478 article EN Proceedings. 26th International Conference on Software Engineering 2004-09-28

Interacting with objects often requires following a protocol---for instance, a specific sequence of method calls. These protocols are not always documented, and violations can lead to subtle problems. Our approach takes code examples to automatically infer legal sequences of method calls. The resulting patterns can then be used to detect anomalies such as "Before calling next(), one normally calls hasNext()." To our knowledge, this is the first fully automatic defect detection approach that learns and checks method-call sequences. JADET...

10.1145/1287624.1287632 article EN 2007-09-07
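
A rough sketch of mining call-order patterns and flagging violations, much simplified from JADET: frequent "a before b" pairs become the protocol, and a usage that contains the later call but not the earlier one is flagged. The call sequences, the support threshold, and the violation heuristic are invented.

```python
from collections import Counter

# Hypothetical call sequences on iterator objects, mined from code examples.
usages = [
    ["iterator", "hasNext", "next", "hasNext", "next"],
    ["iterator", "hasNext", "next"],
    ["iterator", "hasNext", "next", "remove"],
    ["iterator", "next"],                      # suspicious: next() without hasNext()
]

def call_pairs(seq):
    """Temporal patterns: pairs (a, b) where a is called before b on the same object."""
    return {(a, b) for i, a in enumerate(seq) for b in seq[i + 1:]}

pattern_support = Counter(p for u in usages for p in call_pairs(u))
patterns = {p for p, n in pattern_support.items() if n >= 3}   # frequent pairs become "protocol"

for i, usage in enumerate(usages):
    expected = {(a, b) for (a, b) in patterns if b in usage}    # patterns this usage should obey
    violations = expected - call_pairs(usage)
    if violations:
        print(f"usage {i}: anomaly, missing {violations}")      # flags usage 3: hasNext before next
```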

In program debugging, finding a failing run is only the first step; what about correcting the fault? Can we automate the second task as well as the first? The AutoFix-E tool automatically generates and validates fixes for software faults. The key insights behind it are to rely on contracts present in the software to ensure that the proposed fixes are semantically sound, and on state diagrams using an abstract notion of state based on the boolean queries of a class. Out of 42 faults found by automatic testing in two widely used Eiffel libraries, AutoFix-E proposes successful fixes for 16...

10.1145/1831708.1831716 article EN 2010-07-12