- Software Engineering Research
- Open Source Software Innovations
- Software Reliability and Analysis Research
- Software Engineering Techniques and Practices
- Software System Performance and Reliability
- Advanced Malware Detection Techniques
- Software Testing and Debugging Techniques
- Scientific Computing and Data Management
- Online Learning and Analytics
- Quantum chaos and dynamical systems
- Chaos control and synchronization
- Wikis in Education and Collaboration
- Artificial Intelligence in Healthcare and Education
- COVID-19 diagnosis using AI
- Topic Modeling
- Mathematical Dynamics and Fractals
- Natural Language Processing Techniques
- Web Data Mining and Analysis
- Auction Theory and Applications
- Refrigeration and Air Conditioning Technologies
- Mobile Crowdsensing and Crowdsourcing
- Nonlinear Dynamics and Pattern Formation
- High-Velocity Impact and Material Behavior
- Game Theory and Applications
- Multimodal Machine Learning Applications
Shinshu University
2019-2025
Ōtani University
2020-2024
Nara Institute of Science and Technology
2013-2021
University of Waterloo
2021
University College London
2021
University of London
2021
Kagoshima University
1990-2020
Shiseido Group (Japan)
2019
Mahidol University
2019
National Archives and Records Administration
2014
Pseudo-code written in natural language can aid the comprehension of source code in unfamiliar programming languages. However, the great majority of source code has no corresponding pseudo-code, because pseudo-code is redundant and laborious to create. If pseudo-code could be generated automatically and instantly from given source code, we could allow for on-demand production of pseudo-code without human effort. In this paper, we propose a method to generate pseudo-code from source code, specifically adopting the statistical machine translation (SMT) framework. SMT, which was originally designed...
As a novel coronavirus swept the world in early 2020, thousands of software developers began working from home. Many did so on short notice, under difficult and stressful conditions. This study investigates the effects of the pandemic on developers' wellbeing and productivity. A questionnaire survey was created mainly from existing, validated scales and translated into 12 languages. The data was analyzed using non-parametric inferential statistics and structural equation modeling. The questionnaire received 2225 usable responses from 53 countries....
Defect prediction models are proposed to help a team prioritize the areas of source code files that need Software Quality Assurance (SQA) based on the likelihood of having defects. However, developers may waste effort on the whole file while only a small fraction of its lines are defective. Indeed, we find that as little as 1-3 percent of the lines in a file are defective. Hence, in this work, we propose a novel framework (called Line-DP)...
This paper introduces DevGPT, a dataset curated to explore how software developers interact with ChatGPT, a prominent large language model (LLM). The dataset encompasses 29,778 prompts and responses from ChatGPT, including 19,106 code snippets, and is linked to corresponding software development artifacts such as source code, commits, issues, pull requests, discussions, and Hacker News threads. This comprehensive dataset is derived from shared ChatGPT conversations collected from GitHub and Hacker News, providing a rich resource for understanding the dynamics of...
There have been many bug prediction models built with historical metrics, which are mined from version histories of software modules. Many studies have reported the effectiveness of these historical metrics. For prediction levels, most studies have targeted package and file levels. Prediction on a fine-grained level, which represents the method level, is required because there may be interesting results compared to coarse-grained (package and file levels) prediction. These results include good performance when considering quality assurance efforts, new findings about...
Bug reports are widely used in several research areas such as bug prediction, bug triaging, etc. The performance of these studies relies on the information from bug reports. A previous study showed that a significant number of bug reports are actually misclassified between bugs and non-bugs. However, classifying bug reports is a time-consuming task. In the previous study, researchers spent 90 days manually classifying more than 7,000 bug reports. To tackle this problem, we propose automatic bug report classification techniques. We apply topic modeling...
Links are an essential feature of the World Wide Web, and source code repositories are no exception. However, despite their many undisputed benefits, links can suffer from decay, insufficient versioning, and lack of bidirectional traceability. In this paper, we investigate the role of links contained in source code comments from these perspectives. We conducted a large-scale study of around 9.6 million links to establish their prevalence, and used a mixed-methods approach to identify the links' targets, purposes, and evolutionary aspects. We found that links are prevalent...
Implicit gender bias in software development is a well-documented issue, such as the association of technical roles with men. To address this bias, it is important to understand it in more detail. This study uses data mining techniques to investigate the extent to which 56 tasks related to software development, such as assigning GitHub issues and testing, are affected by implicit gender bias embedded in large language models. We systematically translated each task from English into a genderless language and back, and investigated the pronouns associated with each task. Based...
Previous studies have found that a significant number of bug reports are misclassified between bugs and nonbugs, and that manually classifying bug reports is a time-consuming task. To address this problem, we propose a bug report classification model with N-gram IDF, a theoretical extension of Inverse Document Frequency (IDF) for handling words and phrases of any length. N-gram IDF enables us to extract key terms of any length from texts, and these key terms can be used as the features to classify bug reports. We build classification models with logistic regression and random forest using topic...
The importance of supporting test and maintenance activities in software development has been increasing, since recent software systems have become large and complex. Although in the field of Mining Software Repositories (MSR) there are many promising approaches to predicting, localizing, and triaging bugs, most of them do not consider the impacts of each bug on users and developers but rather treat all bugs with equal weighting, excepting a few studies on high impact bugs including security, performance, blocking, and so forth. To make...
Papua New Guinea (PNG) is an emerging tech society with an opportunity to overcome geographic and social boundaries, in order to engage with the global market. However, the current tech landscape, dominated by Big Tech in Silicon Valley and other multinational companies in the Global North, tends to overlook the requirements of emerging economies such as PNG. This is becoming more obvious as issues such as algorithmic bias (in tech product deployments) and the digital divide (as in the case of non-affordable commercial software) are affecting PNG users. The Open Source...
Automatic identification of the differences between two versions of a file is a common and basic task in several applications of mining code repositories. Git, a version control system, has a diff utility, and users can select diff algorithms ranging from the default Myers algorithm to the advanced Histogram algorithm. From our systematic mapping, we identified three popular applications of diff in recent studies. On the impact on code churn metrics in 14 Java projects, we obtained different values in 1.7% to 8.2% of commits based on the different diff algorithms. Regarding...
Bug fixing is generally a manually-intensive task. However, recent work has proposed the idea of automated program repair, which aims to repair (at least a subset of) bugs in different ways such as code mutation, etc. Following the same line of work as automated bug repair, in this paper we aim to leverage past fixes to propose fixes for current/future bugs. Specifically, we propose Ratchet, a corrective patch generation system using neural machine translation. By learning corresponding pre-correction and post-correction code in past fixes with a sequence-to-sequence model,...
Self-admitted technical debt refers to situations where a software developer knows that their current implementation is not optimal and indicates this using a source code comment. In this work, we hypothesize that it is possible to develop automated techniques to understand a subset of these comments in more detail, and to propose tool support that can help developers manage self-admitted technical debt more effectively. Based on a qualitative study of 333 comments indicating self-admitted technical debt, we first identify one particular class of debt amenable to automated management: on-hold...
Discussions is a new feature of GitHub for asking questions or discussing topics outside of specific Issues or Pull Requests. Before being available to all projects in December 2020, it had been tested on selected open source software projects. To understand how developers use this novel feature, how they perceive it, and how it impacts the development processes, we conducted a mixed-methods study based on early adopters of GitHub discussions from January until July 2020. We found that: (1) errors, unexpected...
Software systems are changed continuously for adapting to the environment, correcting faults, improving performance, and so on. For in-depth analysis related to software evolution, it is informative to obtain histories of fine-grained source code entities. This paper presents a tool named Historage that can provide entire histories of fine-grained entities in Java, such as methods, constructors, fields, etc. A characteristic of Historage is the ability to trace entity histories including renaming changes. We applied our technique to five open...
We propose a sentiment classification method with a general machine-learning framework. In a comparison on publicly available data sets, our method achieved the highest F1 values in positive and negative sentences on all data sets.
GitHub's Copilot for Pull Requests (PRs) is a promising service aiming to automate various developer tasks related to PRs, such as generating summaries of changes or providing complete walkthroughs with links to the relevant code. As this innovative technology gains traction in the Open Source Software (OSS) community, it is crucial to examine its early adoption and its impact on the development process. Additionally, it offers a unique opportunity to observe how developers respond when they disagree with the generated content. In...