David Bowes

ORCID: 0000-0001-7014-2811
Research Areas
  • Software Engineering Research
  • Software Reliability and Analysis Research
  • Software Testing and Debugging Techniques
  • Software System Performance and Reliability
  • Software Engineering Techniques and Practices
  • Advanced Malware Detection Techniques
  • Imbalanced Data Classification Techniques
  • Open Source Software Innovations
  • Neural dynamics and brain function
  • Advanced Software Engineering Methodologies
  • Topic Modeling
  • Natural Language Processing Techniques
  • Receptor Mechanisms and Signaling
  • Financial Reporting and Valuation Research
  • Consumer Market Behavior and Pricing
  • Knowledge Management and Sharing
  • Neuroscience and Neuropharmacology Research
  • Advanced Memory and Neural Computing
  • Consumer Retail Behavior Studies
  • Network Security and Intrusion Detection
  • CCD and CMOS Imaging Sensors
  • Artificial Intelligence in Healthcare
  • Anomaly Detection Techniques and Applications
  • Industrial Vision Systems and Defect Detection
  • Advanced Data Processing Techniques

Lancaster University
2019-2024

University of Hertfordshire
2009-2018

University of Central Lancashire
2018

Brunel University of London
2017

Background: The accurate prediction of where faults are likely to occur in code can help direct test effort, reduce costs, and improve the quality of software. Objective: We investigate how the context of models, the independent variables used, and the modeling techniques applied influence the performance of fault prediction models. Method: We used a systematic literature review to identify 208 fault prediction studies published from January 2000 to December 2010. We synthesize the quantitative and qualitative results of 36 studies which report sufficient contextual...

10.1109/tse.2011.103 article EN IEEE Transactions on Software Engineering 2011-10-11

Background. The ability to predict defect-prone software components would be valuable. Consequently, there have been many empirical studies to evaluate the performance of different techniques endeavouring to accomplish this effectively. However, no one technique dominates, and so designing a reliable defect prediction model remains problematic. Objective. We seek to make sense of the conflicting experimental results and understand which factors have the largest effect on predictive performance. Method. We conduct...

10.1109/tse.2014.2322358 article EN IEEE Transactions on Software Engineering 2014-06-01

During the last 10 years, hundreds of different defect prediction models have been published. The performance of the classifiers used in these models is reported to be similar, with models rarely performing above the predictive performance ceiling of about 80% recall. We investigate the individual defects that four classifiers predict and analyse the level of prediction uncertainty produced by the classifiers. We perform a sensitivity analysis to compare the performance of Random Forest, Naïve Bayes, RPart and SVM classifiers when predicting defects in NASA, open source and commercial datasets. The predictions each classifier...

10.1007/s11219-016-9353-3 article EN cc-by Software Quality Journal 2017-02-07

We investigate the relationship between faults and five of Fowler et al.'s least-studied smells in code: Data Clumps, Switch Statements, Speculative Generality, Message Chains, and Middle Man. We developed a tool to detect these smells in three open-source systems: Eclipse, ArgoUML, and Apache Commons. We collected fault data from the change repositories of each system. We built Negative Binomial regression models to analyse the relationships between smells and faults and report the McFadden effect size of those relationships. Our results suggest that Switch Statements had...

10.1145/2629648 article EN ACM Transactions on Software Engineering and Methodology 2014-09-05

Background: The NASA Metrics Data Program data sets have been heavily used in software defect prediction experiments. Aim: To demonstrate and explain why these data sets require significant pre-processing in order to be suitable for defect prediction. Method: A meticulously documented data cleansing process involving all 13 of the original NASA data sets. Results: After our novel data cleansing process, each of the data sets had between 6 and 90 percent fewer of their original number of recorded values. Conclusions: One: Researchers need to analyse the data that forms the basis of their findings in the context...

10.1049/ic.2011.0012 article EN 2011-01-01

A key to the success of automatic program repair (APR) techniques is how easily they can be used in an industrial setting. In this article, we describe a collaboration by a team from four U.K.-based universities with Bloomberg (London) in implementing automatic, high-quality fixes to its code base. We explain the motivation for adopting APR, the mechanics of the prototype tool that was built, and the practicalities of integrating APR into existing systems.

10.1109/ms.2021.3071086 article EN IEEE Software 2021-04-05

Background: The NASA Metrics Data Program (MDP) data sets have been heavily used in software defect prediction research. Aim: To highlight the data quality issues present in these data sets, and the problems that can arise when they are used in a binary classification context. Method: A thorough exploration of all 13 original NASA data sets, followed by various experiments demonstrating the potential impact of duplicate data points when data mining. Conclusions: Firstly, researchers need to analyse the data that forms the basis of their findings in the context of how it will be used....

10.1049/iet-sen.2011.0132 article EN IET Software 2012-10-25
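
The duplicate-data-point problem described in this abstract can be illustrated with a minimal sketch. It assumes each data point is a sequence of metric values followed by a class label; the function names and the toy rows are hypothetical and do not come from the paper or from the NASA data sets. The idea is that an identical row appearing in both the training and the test set lets a classifier "predict" a point it has already seen.

```python
def drop_duplicates(rows):
    """Return rows with exact repeats removed, keeping first occurrences."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def leaked_rows(train_rows, test_rows):
    """Return the test rows that also occur, value for value, in training."""
    train_keys = {tuple(row) for row in train_rows}
    return [row for row in test_rows if tuple(row) in train_keys]


# Hypothetical module metrics: (lines of code, cyclomatic complexity, defective?)
train = [(120, 7, 1), (35, 2, 0), (35, 2, 0), (400, 21, 1)]
test = [(35, 2, 0), (80, 5, 0)]

print(len(drop_duplicates(train)))  # number of distinct training rows
print(leaked_rows(train, test))     # test rows already seen during training
```

Whether duplicates should be removed, or only kept apart across the split, depends on how the data will be used, which is the point the abstract's conclusion makes.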

We introduce mutation-aware fault prediction, which leverages additional guidance from metrics constructed in terms of mutants and the test cases that cover and detect them. We report the results of 12 sets of experiments, applying 4 different predictive modelling techniques to 3 large real-world systems (both open and closed source). The results show that our proposal can significantly (p ≤ 0.05) improve fault prediction performance. Moreover, mutation-based metrics lie in the top 5% of the most frequently relied upon fault predictors in 10 of the 12 sets of experiments, and provide the majority...

10.1145/2931037.2931039 article EN 2016-07-07

Background: The NASA datasets have previously been used extensively in studies of software defects. In 2013 Shepperd et al. presented an essential set of rules for removing erroneous data from the NASA datasets, making this data more reliable to use.

10.1145/2915970.2916007 article EN 2016-05-24

Background: Ensemble techniques have gained attention in various scientific fields. Defect prediction researchers have investigated many state-of-the-art ensemble models and concluded that in many cases these outperform standard single classifier techniques. Almost all previous work using ensemble techniques in defect prediction relies on the majority voting scheme for combining prediction outputs, and on the implicit diversity among classifiers. Aim: Investigate whether defect prediction can be improved using an explicit diversity technique with a stacking ensemble, given the fact that different...

10.1145/2961111.2962610 article EN 2016-09-08

There are many hundreds of fault prediction models published in the literature. The predictive performance of these models is often reported using a variety of different measures. Most performance measures are not directly comparable. This lack of comparability means that it is often difficult to evaluate the performance of one model against another. Our aim is to present an approach that allows other researchers and practitioners to transform many performance measures of categorical studies back into a confusion matrix. Once performance is expressed in a confusion matrix, alternative preferred performance measures can then be derived. Our approach has...

10.1145/2365324.2365338 article EN 2012-09-21
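
As an illustration of the kind of transformation this abstract describes, the sketch below rebuilds a confusion matrix from precision and recall. It is not the paper's procedure: it assumes the study also reports the total number of instances and the number of defective instances, and the function name and example figures are hypothetical.

```python
def confusion_matrix_from_measures(precision, recall, n_total, n_defective):
    """Rebuild (tp, fp, fn, tn) from precision, recall and class counts.

    Assumes precision = tp / (tp + fp) and recall = tp / (tp + fn),
    with the defective class treated as the positive class.
    """
    if not 0 < precision <= 1:
        raise ValueError("precision must be in (0, 1]")
    if not 0 <= recall <= 1:
        raise ValueError("recall must be in [0, 1]")
    tp = recall * n_defective              # recall fixes the true positives
    fn = n_defective - tp                  # the remaining defective instances
    fp = tp * (1 - precision) / precision  # precision fixes the false positives
    tn = n_total - n_defective - fp        # whatever is left is true negative
    return tp, fp, fn, tn


# Hypothetical study: 1000 modules, 100 defective, precision 0.5, recall 0.8.
tp, fp, fn, tn = confusion_matrix_from_measures(0.5, 0.8, 1000, 100)
accuracy = (tp + tn) / (tp + fp + fn + tn)  # a measure the study did not report
print(round(tp), round(fp), round(fn), round(tn), round(accuracy, 2))
```

Once the four cells are known, any other measure defined on the confusion matrix can be derived, which is what makes results reported with different measures comparable.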

In this study, we analyzed issues and comments on GitHub projects and built collaboration networks dividing contributors into two categories: users and commenters. We identified as commenters those users who only post comments without posting any issues nor committing changes in the source code. Since previous studies showed that there is a link between a positive environment (regarding affectiveness) and productivity, our goal was to investigate commenters' contribution to the project concerning affectiveness.

10.1145/3194932.3194936 article EN 2018-06-02

Background: Systematic literature reviews are increasingly used in software engineering. Most systematic literature reviews require several hundred papers to be examined and assessed. This is not a trivial task and can be time consuming and error-prone. Aim: We present SLuRp - our open source web enabled database that supports the management of systematic literature reviews.

10.1145/2372233.2372243 article EN 2012-09-22

Software defect prediction performance varies over a large range. Menzies suggested there is a ceiling effect of 80% Recall [8]. Most of the data sets used are highly imbalanced. This paper asks, what is the empirical effect of using different datasets with varying levels of imbalance on predictive performance? We use data synthesised by a previous meta-analysis of 600 fault prediction models and their results. Four model evaluation measures (the Matthews Correlation Coefficient (MCC), F-Measure, Precision and Recall) are compared to...

10.1145/2810146.2810150 article EN 2015-09-15
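
The four evaluation measures named in this abstract have standard definitions over a confusion matrix, sketched below. The two example data sets are hypothetical and not taken from the paper: both are given the same recall (0.8) and the same false-positive rate (0.1), and differ only in the proportion of defective modules, to show how imbalance alone can move Precision, F-Measure and MCC.

```python
import math


def precision(tp, fp):
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    return tp / (tp + fn) if tp + fn else 0.0


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (F1)."""
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r) if p + r else 0.0


def mcc(tp, fp, fn, tn):
    """Matthews Correlation Coefficient; 0.0 when any marginal is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical confusion matrices as (tp, fp, fn, tn), 1000 modules each.
balanced = (400, 50, 100, 450)   # 50% defective
imbalanced = (40, 95, 10, 855)   # 5% defective

for name, (tp, fp, fn, tn) in [("balanced", balanced), ("imbalanced", imbalanced)]:
    print(name,
          round(precision(tp, fp), 3),
          round(recall(tp, fn), 3),
          round(f_measure(tp, fp, fn), 3),
          round(mcc(tp, fp, fn, tn), 3))
```

In this toy comparison recall stays fixed while the other three measures fall on the imbalanced data, which is the kind of effect the abstract's research question targets.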

Background: Studies related to human factors in software engineering are providing insightful information on the emotional state of contributors and the impact this has on the code. The open source software development paradigm involves different roles, and previous studies about emotions in software development have not taken into account what different roles might play when people express their feelings. Aim: We present an analysis of issues and commits on five GitHub projects distinguishing between users and developers, and between one-commit and multi-commit developers....

10.1145/3273934.3273943 article EN 2018-10-10

Evolutionary coupling (EC) is defined as the implicit relationship between 2 or more software artifacts that are frequently changed together. Changing software is widely reported to be defect-prone. In this study, we investigate the effect of EC on the defect proneness of large industrial software systems and explain why the effects vary. We analysed 2 large industrial systems: a legacy financial system and a modern telecommunications system. We collected historical data for 7 years from 5 different software repositories containing 176 thousand files....

10.1002/smr.1842 article EN cc-by Journal of Software Evolution and Process 2017-02-07
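
The definition of evolutionary coupling in this abstract (artifacts that are frequently changed together) can be sketched as a co-change count over commit history. The abstract does not say how the study measured EC, so the pairwise counting, the `min_support` threshold and the example file names below are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count how often each pair of files is changed in the same commit."""
    pairs = Counter()
    for files in commits:
        # Sort so that each pair has one canonical (a, b) ordering.
        for a, b in combinations(sorted(set(files)), 2):
            pairs[(a, b)] += 1
    return pairs


def coupled_pairs(commits, min_support=2):
    """Keep pairs that co-changed at least min_support times (assumed cut-off)."""
    return {pair: n for pair, n in co_change_counts(commits).items()
            if n >= min_support}


# Hypothetical history: each commit is the list of files it touched.
history = [
    ["billing.c", "ledger.c"],
    ["billing.c", "ledger.c", "report.c"],
    ["report.c"],
]
print(coupled_pairs(history))
```

A real analysis would also need to decide how to treat very large commits and how to normalise by each file's overall change frequency; those choices are left out of this sketch.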

Automatic program repair (APR) is a rapidly advancing field of software engineering that aims to supplement or replace manual bug fixing with an automated tool. For APR to be successfully adopted in industry, it is vital that APR tools respond to developer needs and preferences. However, very little research has considered developers' general attitudes to APR or their current bug fixing practices (the activity APR aims to replace). This paper responds to this gap by reporting on a survey of 386 software developers about their bug finding and fixing experiences, and their instinctive...

10.1109/tse.2022.3194188 article EN cc-by IEEE Transactions on Software Engineering 2022-07-27

A systematic review of the research literature on fault-prediction models from 2000 through 2010 identified 36 studies that sufficiently defined their models and development context and methodology. The authors quantitatively analyzed 19 of these studies and the 206 models they presented. They identified several key features to help industry software developers build or optimize models suitable for their specific contexts.

10.1109/ms.2011.138 article EN IEEE Software 2011-10-21

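The aim of this paper is to investigate the quality of methodology in software fault prediction studies using machine learning. Over two hundred studies of fault prediction have been published in the last 10 years. There is evidence to suggest that the quality of methodology used in some of these studies does not allow us to have confidence in the predictions reported by them. We evaluate the machine learning methodology used in 21 fault prediction studies. All of these studies use NASA data sets. We score each study from 1 to 10 in terms of the quality of their machine learning methodology (e.g. whether or not studies report randomising their cross validation folds). Only [...] out of the 21 studies scored 5 or more out of 10. Furthermore only [...] When we plot...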

10.1109/icmla.2012.226 article EN 2012-12-01

Background: Test quality is a prerequisite for achieving production system quality. While the concept of quality is multidimensional, most of the effort in the testing context has been channelled towards measuring test effectiveness. Objective: While the effectiveness of tests is certainly important, we aim to identify a core list of testing principles that also address other quality facets of testing, and to discuss how they can be quantified as indicators of test quality. Method: We have conducted a two-day workshop with our industry partners to come up with relevant best...

10.1109/wetsom.2017.2 article EN 2017-05-01