- Software Engineering Research
- Software Reliability and Analysis Research
- Machine Learning and Data Classification
- Data Mining Algorithms and Applications
- Privacy-Preserving Technologies in Data
- Software System Performance and Reliability
- Software Testing and Debugging Techniques
- Imbalanced Data Classification Techniques
- Historical and Linguistic Studies
- Archaeology and Historical Studies
- Biblical Studies and Interpretation
- Data Quality and Management
- Historical, Religious, and Philosophical Studies
- Islamic Studies and History
- Christian Theology and Mission
- Software Engineering Techniques and Practices
- Byzantine Studies and History
- Classical Philosophy and Thought
- Medieval and Classical Philosophy
- Religion, Society, and Development
- Machine Learning and Algorithms
- Mobile Crowdsensing and Crowdsourcing
- Data Stream Mining Techniques
- Privacy, Security, and Data Protection
- Web Application Security Vulnerabilities
University of Limerick
2015-2018
Lero
2015-2018
Morgantown High School
2014
West Virginia University
2009-2013
How can we find data for quality prediction? Early in the life cycle, projects may lack the data needed to build such predictors. Prior work assumed that relevant training data was found nearest to the local project. But is this the best approach? This paper introduces the Peters filter, which is based on the following conjecture: When local data is scarce, more information exists in other projects. Accordingly, this filter selects training data via the structure of other projects. To assess the performance of the Peters filter, we compare it with two other approaches for quality prediction. Within-company learning and...
Background: Cross-company defect prediction (CCDP) is a field of study where an organization lacking enough local data can use data from other organizations for building defect predictors. To support CCDP, data must be shared. Such shared data must be privatized, but that privatization could severely damage the utility of the data. Aim: To enable effective CCDP while preserving privacy. Method: We explore privatization algorithms that maintain class boundaries in a dataset. CLIFF is an instance pruner that deletes irrelevant examples. MORPH is a data mutator that moves the data a random...
The fundamental issue in cross project defect prediction is selecting the most appropriate training data for creating quality predictors. Another concern is whether historical data of open-source projects can be used to create quality predictors for proprietary projects from a practical point-of-view. Current studies have proposed statistical approaches to finding these training data; however, thus far no apparent effort has been made to study their success on proprietary data. Also, these methods apply brute force techniques which are computationally...
Security bug reports can describe security critical vulnerabilities in software products. Bug tracking systems may contain thousands of bug reports, where relatively few of them are security related. Therefore, finding unlabelled security bugs among them can be challenging. To help security engineers identify these reports quickly and accurately, text-based prediction models have been proposed. These can often mislabel security bug reports due to a number of reasons such as class imbalance, where the ratio of non-security to security bug reports is very high. More critically, we have observed that the presence...
Before a community can learn general principles, it must share individual experiences. Data sharing is the fundamental step of cross project defect prediction, i.e. the process of using data from one project to predict for defects in another. Prior work on secure data sharing allowed data owners to share their data on a single-party basis for defect prediction via data minimization and obfuscation. However, the studied method did not consider that bigger data required the data owner to share more of their data. In this paper, we extend previous work with LACE2, which reduces the amount of data shared by...
Ideally, we can learn lessons from software projects across multiple organizations. However, a major impediment to such knowledge sharing is the privacy concerns of software development organizations. This paper aims to provide defect data-set owners with an effective means of privatizing their data prior to release. We explore MORPH, which understands how to maintain class boundaries in a data-set. MORPH is a data mutator that moves the data a random distance, taking care not to cross class boundaries. The value of training on this MORPHed data is tested via a 10-way...
Target audience: Software practitioners and researchers wanting to understand the state of the art in using data science for software engineering (SE). Content: In the age of big data, data science (the knowledge of deriving meaningful outcomes from data) is an essential skill that software engineers should be equipped with. It can be used to predict useful information on new projects based on completed projects. This tutorial offers core insights about the state-of-the-art in this important field. What participants will learn: Before data science:...
This paper augments Boehm-Turner's model of agile and plan-based software development with an AI search algorithm. The algorithm finds the key factors that predict for the success of agile or traditional plan-based developments. According to our simulations and AI search algorithm: (1) in no case did agile methods perform worse than plan-based approaches; (2) in some cases, agile performed best. Hence, we recommend that the default development practice for organizations be an agile method. The simplicity of this style of analysis begs the question: why is so much time wasted on evidence-less debates...
Using the tools of quantitative data science, software engineers can predict useful information on new projects based on past projects. This tutorial reflects on the state-of-the-art in quantitative reasoning in this important field. This tutorial discusses the following: (a) when local data is scarce, we show how to adapt data from other organizations to local problems; (b) when working with data of dubious quality, we show how to prune spurious information; (c) when data or models seem too complex, we show how to simplify data mining results; (d) when the world changes, and old models need to be updated, we show how to handle those...
Smart cities offer a variety of services to provide citizens with efficient transport, water distribution, crime prevention, and traffic control. Such services are personalized by automatically capturing, storing, and processing personally identifiable data. The disclosure of such data to the service provider raises privacy concerns for application users. As a result, research has recognized the need for privacy aware services in smart cities. In this paper we present PrivacyZones, a privacy awareness framework which requires users to share meaningful...