- Software Engineering Research
- Topic Modeling
- Software System Performance and Reliability
- Advanced Malware Detection Techniques
- Advanced Software Engineering Methodologies
- Software Testing and Debugging Techniques
- Software Engineering Techniques and Practices
- Web Data Mining and Analysis
- Software Reliability and Analysis Research
- Natural Language Processing Techniques
- Web Application Security Vulnerabilities
- Open Source Software Innovations
- Scientific Computing and Data Management
- Service-Oriented Architecture and Web Services
- Expert finding and Q&A systems
- Explainable Artificial Intelligence (XAI)
- Ethics and Social Impacts of AI
- Artificial Intelligence in Healthcare and Education
- Video Analysis and Summarization
- Model-Driven Software Engineering Techniques
- Mobile Crowdsensing and Crowdsourcing
- Multimodal Machine Learning Applications
- Information and Cyber Security
- Network Security and Intrusion Detection
- Privacy, Security, and Data Protection
Data61
2020-2025
Commonwealth Scientific and Industrial Research Organisation
2020-2025
Australian National University
2017-2025
Monash University
2019-2024
City University of Hong Kong
2024
Singapore Management University
2023
Fudan University
2018
Nanyang Technological University
2012-2017
National University of Singapore
2010-2013
University of Alberta
2003-2008
This paper presents UMLDiff, an algorithm for automatically detecting structural changes between the designs of subsequent versions of object-oriented software. It takes as input two class models of a Java software system, reverse engineered from two corresponding code versions. It produces as output a change tree, i.e., a tree of structural changes, that reports the differences between the two design versions in terms of (a) additions, removals, moves, renamings of packages, classes, interfaces, fields and methods, (b) changes to their attributes, and (c) changes of the dependencies...
During software development and maintenance, developers spend a considerable amount of time on program comprehension activities. Previous studies show that program comprehension takes up as much as half of a developer's time. However, most of these studies are performed in a controlled setting, or with a small number of participants, and investigate the program comprehension activities only within the IDEs. However, developers' program comprehension activities go well beyond their IDE interactions. In this paper, we extend our ActivitySpace framework to collect and analyze Human-Computer Interaction (HCI) data...
Text categorization, or text classification, is one of the key tasks for representing the semantic information of documents. Multi-label text categorization is a finer-grained approach to text categorization which consists of assigning multiple target labels to documents. It is more challenging compared to the task of multi-class text categorization due to the exponential growth of label combinations. Existing approaches to multi-label text categorization fall short to extract local semantic information and to model label correlations. In this paper, we propose an ensemble application of convolutional and recurrent neural networks to capture both...
Commit messages can be regarded as the documentation of software changes. These messages describe the content and purposes of changes, hence are useful for program comprehension and software maintenance. However, due to the lack of time and direct motivation, commit messages sometimes are neglected by developers. To address this problem, Jiang et al. proposed an approach (we refer to it as NMT), which leverages a neural machine translation algorithm to automatically generate short commit messages from code. The reported performance of their approach is promising, however, they...
Developers often need to search for appropriate APIs for their programming tasks. Although most libraries have API reference documentation, it is not easy to find appropriate APIs due to the lexical gap and knowledge gap between the natural language description of the programming task and the API description in API documentation. Here, the lexical gap refers to the fact that the same semantic meaning can be expressed by different words, and the knowledge gap refers to the fact that API documentation mainly describes API functionality and structure but lacks other types of information like concepts and purposes, which are usually the key information in the task description. In this...
A GUI skeleton is the starting point for implementing a UI design image. To obtain a GUI skeleton from a UI design image, developers have to visually understand UI elements and their spatial layout in the image, and then translate this understanding into proper GUI components and their compositions. Automating this visual understanding and translation would be beneficial for bootstrapping mobile GUI implementation, but it is a challenging task due to the diversity of UI designs and the complexity of GUI skeletons to generate. Existing tools are rigid as they depend on heuristically-designed visual understanding and GUI generation rules. In...
Recent breakthroughs in natural language processing (NLP) have permitted the synthesis and comprehension of coherent text in an open-ended way, therefore translating the theoretical algorithms into practical applications. The large language models (LLMs) have significantly impacted businesses such as report summarization software and copywriters. Observations indicate, however, that LLMs may exhibit social prejudice and toxicity, posing ethical and societal dangers of consequences resulting from irresponsibility. Large-scale...
Applications built on reusable component frameworks are subject to two independent, and potentially conflicting, evolution processes. The application evolves in response to the specific requirements and desired qualities of the application's stakeholders. On the other hand, the evolution of the component framework is driven by the need to improve the framework functionality and quality while maintaining its generality. Thus, changes to the component framework frequently change its API on which its client applications rely and, as a result, these applications break. To date, there has been some work aimed at...
Software vulnerabilities pose significant security risks to the host computing system. Faced with continuous disclosure of software vulnerabilities, system administrators must prioritize their efforts, triaging the most critical vulnerabilities to address first. Many vulnerability scoring systems have been proposed, but they all require expert knowledge to determine intricate vulnerability metrics. In this paper, we propose a deep learning approach to predict the multi-class severity level of a software vulnerability using only the vulnerability description. Compared with intricate vulnerability metrics,...
Consider a question and its answers in Stack Overflow as a knowledge unit. Knowledge units often contain semantically relevant knowledge, and are thus linkable for different purposes, such as duplicate questions, directly linkable for problem solving, or indirectly linkable for related information. Recognizing different classes of linkable knowledge would support more targeted information needs when users search or explore the knowledge base. Existing methods focus on binary relatedness (i.e., related or not), and are not robust to recognize different classes of semantic relatedness when linkable knowledge units share few words in common, i.e., have lexical...
The prevalence of questions and answers on domain-specific Q&A sites like Stack Overflow constitutes a core knowledge asset for the software engineering domain. Although search engines can return a list of questions relevant to a user query of some technical question, the abundance of relevant posts and the sheer amount of information in them makes it difficult for developers to digest them and find the most needed answers to their questions. In this work, we aim to help developers who want to quickly capture the key points of several answer posts relevant to a technical question before they read the details of the posts. We...
Technical debt is a metaphor to reflect the tradeoff software engineers make between short-term benefits and long-term stability. Self-admitted technical debt (SATD), a variant of technical debt, has been proposed to identify debt that is intentionally introduced during software development, e.g., temporary fixes and workarounds. Previous studies have leveraged human-summarized patterns (which represent n-gram phrases that can be used to identify SATD) or text-mining techniques to detect SATD in source code comments. However, several characteristics...
According to the World Health Organization (WHO), it is estimated that approximately 1.3 billion people live with some form of vision impairment globally, of whom 36 million are blind. Due to their disability, engaging this minority in society is a challenging problem. The recent rise of smart mobile phones provides a new solution by enabling blind users' convenient access to the information and service for understanding the world. Users with vision impairment can adopt the screen reader embedded in the mobile operating systems to read the content of each...
API documentation provides important knowledge about the functionality and usage of APIs. In this paper, we focus on API caveats that developers should be aware of in order to avoid unintended use of an API. Our formative study of Stack Overflow questions suggests that API caveats are often scattered in multiple API documents, and are buried in lengthy textual descriptions. These characteristics make the API caveats less discoverable. When developers fail to notice API caveats, it is very likely to cause some unexpected programming errors. In this paper, we propose natural language...
UI design is an integral part of software development. For many developers who do not have much UI design experience, exposing them to a large database of real-application UI designs can help them quickly build up a realistic understanding of the design space for a software feature and get design inspirations from existing applications. However, existing keyword-based, image-similarity-based, and component-matching-based methods cannot reliably find relevant high-fidelity UI designs in a large database alike to the UI wireframe that the developers sketch, in face of the great variations in UI designs. In this article, we...
Although AI is transforming the world, there are serious concerns about its ability to behave and make decisions responsibly. Many ethical regulations, principles, and frameworks for responsible AI have been issued recently. However, they are high level and difficult to put into practice. On the other hand, most AI researchers focus on algorithmic solutions, while the responsible AI challenges actually crosscut the entire engineering lifecycle and components of AI systems. To close the gap in operationalizing responsible AI, this paper aims to develop a roadmap...
The rapid growth of software supply chain attacks has attracted considerable attention to software bill of materials (SBOM). SBOMs are a crucial building block to ensure the transparency of software supply chains that helps improve software supply chain security. Although there are significant efforts from academia and industry to facilitate SBOM development, it is still unclear how practitioners perceive SBOMs and what are the challenges of adopting SBOMs in practice. Furthermore, existing SBOM-related studies tend to be ad-hoc and lack software engineering focuses. To bridge this gap, we...
Software engineering social content, such as Q&A discussions on Stack Overflow, has become a wealth of information on software engineering. This textual content is centered around software-specific entities, and their usage patterns, issues-solutions, and alternatives. However, existing approaches to analyzing software engineering texts treat software-specific entities in the same way as other content, and thus cannot support the recent advance of entity-centric applications, such as direct answers and knowledge graph. The first step towards enabling these entity-centric applications...
Software development often requires knowledge beyond what developers already possess. In such cases, developers have to seek help from different sources of information. As a metacognitive skill, help seeking influences software developers' efficiency and success in many situations. However, there has been little research to provide a systematic investigation of the general process of help seeking activities in software engineering and the human and system factors affecting help seeking. This paper reports our empirical study aiming to fill this gap. Our study includes...
Online communities like Dribbble and GraphicBurger allow GUI designers to share their design artwork and learn from each other. These design sharing platforms are important sources for design inspiration, but our survey with GUI designers suggests additional information needs unmet by existing design sharing platforms. First, designers need to see the practical use of certain GUI designs in real applications, rather than just artworks. Second, designers want to see not only the overall designs but also the detailed design of the GUI components. Third, designers need advanced GUI design search abilities (e.g., multi-facets search)...
Establishing API mappings between third-party libraries is a prerequisite step for library migration tasks. Manually establishing API mappings is tedious due to the large number of APIs to be examined. Having an automatic technique to create a database of likely API mappings can significantly ease the task. Unfortunately, existing techniques either adopt a supervised learning mechanism that requires already-ported or functionality similar applications across major programming languages or platforms, which are difficult to come by for an arbitrary...
Graphical User Interface (GUI) elements detection is critical for many GUI automation and GUI testing tasks. Acquiring the accurate positions and classes of GUI elements is also the very first step to conduct GUI reverse engineering or perform GUI testing. In this paper, we implement User Interface Element Detection (UIED), a toolkit designed to provide users with a simple and easy-to-use platform to achieve accurate GUI element detection. UIED integrates multiple detection methods including old-fashioned computer vision (CV) approaches and deep learning models to handle...
Third-party libraries are an integral part of many software projects. It often happens that developers need to find analogical libraries that can provide comparable features to the libraries they are already familiar with. Existing methods to find analogical libraries are limited by the community-curated list of libraries, blogs, or Q&A posts, which often contain overwhelming or out-of-date information. In this paper, we present a new approach to recommend analogical libraries based on a knowledge base of analogical libraries mined from tags of millions of Stack Overflow questions. The novelty of our approach is to solve...