- Software System Performance and Reliability
- Software Testing and Debugging Techniques
- Software Engineering Research
- Software Reliability and Analysis Research
- Cloud Computing and Resource Management
- Advanced Software Engineering Methodologies
- Parallel Computing and Optimization Techniques
- Quantum Computing Algorithms and Architecture
- Radiation Effects in Electronics
- Advanced Data Storage Technologies
- Software Engineering Techniques and Practices
- Scientific Computing and Data Management
- Machine Learning and Data Classification
- Explainable Artificial Intelligence (XAI)
- Artificial Intelligence in Healthcare and Education
- Electronic Health Records Systems
- Advanced Malware Detection Techniques
- Data Quality and Management
- Embedded Systems Design Techniques
- Advanced Proteomics Techniques and Applications
- Semantic Web and Ontologies
- Machine Learning in Healthcare
- Model-Driven Software Engineering Techniques
- Big Data and Business Intelligence
- Artificial Intelligence in Law
Simula Research Laboratory
2022-2025
Beihang University
2022
University of Zurich
2017-2021
Quantum computing (QC) promises polynomial and exponential speedups in many domains, such as unstructured search and prime number factoring. However, quantum programs yield probabilistic outputs from exponentially growing distributions and are vulnerable to quantum-specific faults. Existing quantum software testing (QST) approaches treat quantum superpositions as classical distributions. This leads to two major limitations when applied to quantum programs: (1) an exponentially growing sample space distribution and (2) failing to detect quantum-specific faults such as phase flips. To...
Ensuring that software performance does not degrade after a code change is paramount. A solution is to regularly execute software microbenchmarks, a performance testing technique similar to (functional) unit tests, which, however, often becomes infeasible due to extensive runtimes. To address that challenge, research has investigated regression testing techniques, such as test case prioritization (TCP), which reorder the execution within a microbenchmark suite to detect larger performance changes sooner. Such techniques are either designed for tests...
Continuous integration (CI) emphasizes quick feedback to developers. This is at odds with the current practice of performance testing, which predominantly focuses on long-running tests against entire systems in production-like environments. Alternatively, software microbenchmarking attempts to establish a performance baseline for small code fragments in a short time. This paper investigates the quality of microbenchmark suites with a focus on their suitability to deliver quick performance feedback and on CI integration. We study ten open-source libraries written in Java and Go...
Performance regressions have a tremendous impact on the quality of software. One way to catch regressions before they reach production is executing performance tests before deployment, e.g., using software microbenchmarks, which measure performance at subroutine level. In projects with many microbenchmarks, this may take several hours due to repeated execution to get accurate results, disqualifying them from frequent use in CI/CD pipelines. We propose µOpTime, a static approach to reduce the execution time of microbenchmark suites by configuring the number of repetitions for each...
Software benchmarks are only as good as the performance measurements they yield. Unstable benchmarks show high variability among repeated measurements, which causes uncertainty about the actual performance and complicates reliable change assessment. However, if a benchmark is stable or unstable only becomes evident after it has been executed and its results are available. In this paper, we introduce a machine-learning-based approach to predict a benchmark's stability without having to execute it. Our approach relies on 58...
Executing software microbenchmarks, a form of small-scale performance tests predominantly used for libraries and frameworks, is a costly endeavor. Full benchmark suites take up to multiple hours or days to execute, rendering frequent performance checks, e.g., as part of continuous integration (CI), infeasible. However, altering benchmark configurations to reduce execution time without considering the impact on result quality can lead to benchmark results that are not representative of the software's true performance.
Automated test case generation is an effective technique to yield high-coverage test suites. While the majority of research effort has been devoted to satisfying coverage criteria, a recent trend emerged towards optimizing other non-coverage aspects. In this regard, runtime and memory usage are two essential dimensions: less expensive tests reduce the resource demands for the generation process and later regression testing phases. This study shows that performance-aware test case generation requires solving two main challenges: providing a good...
Regression testing comprises techniques which are applied during software evolution to uncover faults effectively and efficiently. While regression testing is widely studied for functional tests, performance regression testing, e.g., with software microbenchmarks, is hardly investigated. Applying test case prioritization (TCP), a regression testing technique, to software microbenchmarks may help capture large performance regressions sooner upon new versions. This may especially be beneficial for microbenchmark suites, because they take considerably longer...
Software performance changes are costly and often hard to detect pre-release. Similar to software testing frameworks, either application benchmarks or microbenchmarks can be integrated into quality assurance pipelines before releasing a new version. Unfortunately, extensive benchmarking studies usually take several hours, which is problematic when examining dozens of daily code changes in detail; hence, trade-offs have to be made. Optimized microbenchmark suites, which only include a small subset of the full suite,...
The Cancer Registry of Norway (CRN) uses an automated cancer registration support system (CaReSS) to support core cancer registry activities, i.e., data capture, data curation, and producing data products and statistics for various stakeholders. GURI is a core component of CaReSS, which is responsible for validating incoming data with medical rules. Such medical rules are manually implemented by medical experts based on medical standards, regulations, and research. Since large language models (LLMs) have been trained on a large amount of public information, including these...
Performance problems in applications should ideally be detected as soon as they occur, i.e., directly when the causing code modification is added to the code repository. To this end, complex and cost-intensive application benchmarks or lightweight but less relevant microbenchmarks can be added to existing build pipelines to ensure performance goals. In this paper, we show how the practical relevance of microbenchmark suites can be improved and verified based on the application flow during an application benchmark run. We propose an approach to determine the overlap of common...
The Cancer Registry of Norway (CRN) is a public body responsible for capturing and curating cancer patient data histories to provide unified access to research data and statistics for doctors, patients, and policymakers. For this purpose, CRN develops and operates a complex, constantly-evolving, and socio-technical software system. Recently, machine learning (ML) algorithms have been introduced into this system to augment the manual decisions made by humans with automated decision support from learned models. To ensure that...
The Cancer Registry of Norway (CRN) collects, curates, and manages data related to cancer patients in Norway, supported by an interactive, human-in-the-loop, socio-technical decision support software system. Automated testing of this software system is inevitable; however, currently, it is limited in CRN's practice. To this end, we present an industrial case study to evaluate an AI-based system-level testing tool, i.e., EvoMaster, in terms of its effectiveness. In particular, we focus on GURI, a medical rule engine, which is a key component at...
Degradation of software performance can become costly for companies and developers, yet it is hardly assessed continuously. A strategy that would allow continuous performance assessment of software libraries is software microbenchmarking, which faces problems such as excessive execution times and unreliable results that hinder wide-spread adoption in continuous integration. In my research, I want to develop techniques that allow including software microbenchmarks into continuous integration by utilizing cloud infrastructure and execution time reduction techniques. These will allow assessing performance on...
The Cancer Registration Support System (CaReSS), built by the Cancer Registry of Norway (CRN), is a complex real-world socio-technical software system that undergoes continuous evolution in its implementation. Consequently, testing CaReSS with automated tools is needed such that its dependability is always ensured. Towards testing a key subsystem of CaReSS, i.e., GURI, we present an application of an extension to the open-source tool EvoMaster, which automatically generates test cases with evolutionary algorithms. We named the extension EvoClass,...
Rigorous performance engineering traditionally assumes measuring on bare-metal environments to control for as many confounding factors as possible. Unfortunately, some researchers and practitioners might not have access, knowledge, or funds to operate dedicated performance testing hardware, making public clouds an attractive alternative. However, cloud environments are inherently unpredictable and variable with respect to their performance. In this study, we explore the effects of cloud environments on the variability of performance testing outcomes, and to what extent regressions...
The Cancer Registry of Norway (CRN) collects and processes cancer-related data for patients in Norway. For this, it employs a sociotechnical software system that evolves with changing requirements and medical standards. The current practice is to manually test CRN's software system to prevent faults and ensure its dependability. This paper focuses on automatically testing GURI, the CRN's medical rule engine, using a system-level testing tool, EvoMaster, in both its black-box and white-box modes, and a novel CRN-specific EvoMaster-based tool, EvoGURI. We empirically...
Performance changes of software systems, and especially performance regressions, have a tremendous impact on users of that system. Historical performance data can help developers to reason about how performance has changed over the course of a software's lifetime. In this demo paper, we present two tools: hopper to mine historical performance metrics based on benchmarks and unit tests, and gopper to analyse the history with respect to performance changes.
Manually managing collaboration becomes a problem in distributed software engineering environments. Individual engineers easily lose track of who to involve and when. The result is a lack of communication, or alternatively communication overload, leading to errors and rework. This paper presents a Domain-Specific Language (DSL) for scripting collaboration structures and their evolution. We demonstrate the DSL's benefits and expressiveness in setting up an iteration planning meeting in an agile software development setting.