NFDI4DS | UHH-SEMS - Publication Details

Duplicate bug report detection with a combination of information retrieval and topic modeling

OPENALEX - Publications

Anh Tuan Nguyen, Tung Thanh Nguyen, Tien N. Nguyen, David Lo, C. P. Sun

Detecting duplicate bug reports helps reduce triaging efforts and save time for developers in fixing the same issues. Among several automated detection approaches, text-based information retrieval (IR) approaches have been shown to outperform others in terms of both accuracy and efficiency. However, those IR-based approaches do not detect well the duplicate reports on the same technical issues written in different descriptive terms.

10.1145/2351676.2351687 article EN 2012-09-03
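
As a rough illustration of the kind of text-based IR baseline the abstract above refers to, the following minimal sketch ranks existing bug reports by TF-IDF cosine similarity to a new report. It is not the paper's method: the tokenizer, the smoothed IDF formula, and the function names (`tfidf_vectors`, `rank_duplicates`) are assumptions made for this example only.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (term -> weight dict) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption of this sketch); terms that occur in
        # every document receive weight 0.
        vectors.append({
            term: count * math.log((1 + n_docs) / (1 + df[term]))
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_duplicates(new_report, existing_reports):
    """Return (score, index) pairs for existing reports, best match first."""
    vectors = tfidf_vectors(existing_reports + [new_report])
    query = vectors[-1]
    scores = [(cosine(query, v), i) for i, v in enumerate(vectors[:-1])]
    return sorted(scores, reverse=True)
```

A purely lexical ranking like this scores a report that says "crash on save" far below one that says "crashes when saving", even though both may describe the same issue, which is the weakness the abstract attributes to IR-based approaches.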

A statistical semantic language model for source code

Tung Thanh Nguyen, Anh Tuan Nguyen, Hoan Anh Nguyen, Tien N. Nguyen

Recent research has successfully applied the statistical n-gram language model to show that source code exhibits a good level of repetition. The n-gram model is shown to have good predictability in supporting code suggestion and completion. However, the state-of-the-art n-gram approach to capture source code regularities/patterns is based only on the lexical information in a local context of the code units. To improve predictability, we introduce SLAMC, a novel statistical semantic language model for source code. It incorporates semantic information into code tokens and models the regularities/patterns of such semantic annotations, called sememes, rather than their...

10.1145/2491411.2491458 article EN 2013-08-18
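
To make the lexical n-gram baseline mentioned in the abstract above concrete, here is a minimal bigram (n = 2) sketch that suggests the next code token from raw counts. It is not SLAMC and carries no semantic annotations; the function names and the unsmoothed maximum-likelihood estimate are assumptions made for this example.

```python
from collections import Counter, defaultdict


def train_bigram_model(token_sequences):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for tokens in token_sequences:
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts


def suggest_next(counts, prev_token, k=3):
    """Return up to k (token, probability) pairs for the next token.

    Probabilities are unsmoothed relative frequencies; an unseen
    previous token yields an empty list.
    """
    followers = counts.get(prev_token)
    if not followers:
        return []
    total = sum(followers.values())
    return [(tok, c / total) for tok, c in followers.most_common(k)]
```

Because the model sees only the preceding lexical token, it cannot tell apart tokens that look alike but play different semantic roles, which is the limitation of purely lexical, local-context models that the abstract describes.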

A topic-based approach for narrowing the search space of buggy files from a bug report

Anh Tuan Nguyen, Tung Thanh Nguyen, Jafar M. Al-Kofahi, Hung Viet Nguyen, Tien N. Nguyen

Locating buggy code is a time-consuming task in software development. Given a new bug report, developers must search through a large number of files in a project to locate buggy code. We propose BugScout, an automated approach to help developers reduce such efforts by narrowing the search space of buggy files when they are assigned to address a bug report. BugScout assumes that the textual contents of a bug report and of its corresponding source code share some technical aspects of the system, which can be used for locating buggy source files given a new bug report. We develop a specialized topic model that represents those...

10.1109/ase.2011.6100062 article EN 2011 26th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2011-11-01

Combining Deep Learning with Information Retrieval to Localize Buggy Files for Bug Reports (N)

An Ngoc Lam, Anh Tuan Nguyen, Hoan Anh Nguyen, Tien N. Nguyen

Bug localization refers to the automated process of locating the potential buggy files for a given bug report. Helping developers focus their attention on those files is crucial. Several existing automated approaches for bug localization from a bug report face a key challenge, called lexical mismatch, in which the terms used in bug reports to describe a bug are different from the terms and code tokens used in source files. This paper presents a novel approach that uses a deep neural network (DNN) in combination with rVSM, an information retrieval (IR) technique. rVSM collects the feature on...

10.1109/ase.2015.73 article EN 2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2015-11-01

A graph-based approach to API usage adaptation

Hoan Anh Nguyen, Tung Thanh Nguyen, Gary Wilson, Anh Tuan Nguyen, Miryung Kim, and 1 more

Reusing existing library components is essential for reducing the cost of software development and maintenance. When library components evolve to accommodate new feature requests, to fix bugs, or to meet new standards, the clients of software libraries often need to make corresponding changes to correctly use the updated libraries. Existing API usage adaptation techniques support simple adaptation such as replacing the target of calls to a deprecated API; however, they cannot handle complex adaptations such as creating a new object to be passed to a different API method, or adding an exception...

10.1145/1869459.1869486 article EN 2010-10-17

Exploring API Embedding for API Usages and Applications

Trong Duc Nguyen, Anh Tuan Nguyen, Hung Phan, Tien N. Nguyen

Word2Vec is a class of neural network models that, when trained from a large corpus of texts, can produce for each unique word a corresponding vector in a continuous space in which linguistic contexts of words can be observed. In this work, we study the characteristics of Word2Vec vectors, called API2VEC or API embeddings, for the API elements within the API sequences in source code. Our empirical study shows that the close proximity of the API2VEC vectors for API elements reflects the similar usage contexts containing the surrounding APIs of those API elements. Moreover, API2VEC can capture several...

10.1109/icse.2017.47 article EN 2017-05-01

API code recommendation using statistical learning from fine-grained changes

Anh Tuan Nguyen, Michael Hilton, Mihai Codoban, Hoan Anh Nguyen, Lily Mast, and 3 more

Learning and remembering how to use APIs is difficult. While code-completion tools can recommend API methods, browsing a long list of API method names and their documentation is tedious. Moreover, users can easily be overwhelmed with too much information. We present a novel API recommendation approach that taps into the predictive power of repetitive code changes to provide relevant API recommendations for developers. Our approach and tool, APIREC, is based on statistical learning from fine-grained code changes and from the context in which those changes were made....

10.1145/2950290.2950333 article EN 2016-11-01

Graph-based statistical language model for code

Anh Tuan Nguyen, Tien N. Nguyen

The n-gram statistical language model has been successfully applied to capture programming patterns to support code completion and suggestion. However, the approaches using n-gram face challenges in capturing the patterns at higher levels of abstraction due to the mismatch between the sequence nature of n-grams and the structure nature of syntax and semantics in source code. This paper presents GraLan, a graph-based statistical language model, and its application in code suggestion. GraLan can learn from a source code corpus and compute the appearance probabilities of any graphs given the observed (sub)graphs. We use GraLan to develop an API...

10.5555/2818754.2818858 article EN International Conference on Software Engineering 2015-05-16

A study of repetitiveness of code changes in software evolution

Hoan Anh Nguyen, Anh Tuan Nguyen, Tung Thanh Nguyen, Tien N. Nguyen, Hridesh Rajan

In this paper, we present a large-scale study of repetitiveness of code changes in software evolution. We collected a large data set of 2,841 Java projects, with 1.7 billion source lines of code (SLOC) at the latest revisions, 1.8 million code change revisions (0.4 million fixes), 6.2 million changed files, and 2.5 billion changed SLOCs. A change is considered repeated within or cross-project if it matches another change having occurred in the history of the project or another project, respectively. We report the following important findings. First, repetitiveness of changes could be as high as 70-100% at small sizes...

10.1109/ase.2013.6693078 article EN 2013 28th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2013-11-01

Lexical statistical machine translation for language migration

Anh Tuan Nguyen, Tung Thanh Nguyen, Tien N. Nguyen

Prior research has shown that source code also exhibits naturalness, i.e., it is written by humans and is likely to be repetitive. The researchers also showed that the n-gram language model is useful in predicting the next token in a source file given a large corpus of existing source code. In this paper, we investigate how well statistical machine translation (SMT) models for natural languages could help in migrating source code from one programming language to another. We treat source code as a sequence of lexical tokens and apply a phrase-based SMT model on the lexemes of those tokens. Our...

10.1145/2491411.2494584 article EN 2013-08-18

Graph-Based Statistical Language Model for Code

Anh Tuan Nguyen, Tien N. Nguyen

The n-gram statistical language model has been successfully applied to capture programming patterns to support code completion and suggestion. However, the approaches using n-gram face challenges in capturing the patterns at higher levels of abstraction due to the mismatch between the sequence nature of n-grams and the structure nature of syntax and semantics in source code. This paper presents GraLan, a graph-based statistical language model, and its application in code suggestion. GraLan can learn from a source code corpus and compute the appearance probabilities of any graphs given the observed (sub)graphs. We use GraLan to develop an API...

10.1109/icse.2015.336 article EN 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering 2015-05-01

Graph-based pattern-oriented, context-sensitive source code completion

Anh Tuan Nguyen, Tung Thanh Nguyen, Hoan Anh Nguyen, Ahmed Tamrawi, Hung Viet Nguyen, and 2 more

Code completion helps improve developers' programming productivity. However, the current support for code completion is limited to context-free code templates or a single method call of the variable on focus. Using software libraries for development, developers often repeat API usages for certain tasks. Thus, a code completion tool could make use of API usage patterns. In this paper, we introduce GraPacc, a graph-based, pattern-oriented, context-sensitive code completion approach that is based on a database of such patterns. GraPacc represents and manages the API usage patterns of multiple...

10.5555/2337223.2337232 article EN International Conference on Software Engineering 2012-06-02

Statistical learning approach for mining API usage mappings for code migration

Anh Tuan Nguyen, Hoan Anh Nguyen, Tung Thanh Nguyen, Tien N. Nguyen

The same software product nowadays could appear on multiple platforms and devices. To address business needs, software companies develop a software product in a programming language and then migrate it to another one. To support that process, semi-automatic migration tools have been proposed. However, they require users to manually define the mappings between the respective APIs of the libraries used in two languages. To reduce such manual effort, we introduce StaMiner, a novel data-driven approach that statistically learns the mappings between APIs from the corpus of the corresponding...

10.1145/2642937.2643010 article EN 2014-09-15

Graph-based pattern-oriented, context-sensitive source code completion

Anh Tuan Nguyen, Tung Thanh Nguyen, Hoan Anh Nguyen, Ahmed Tamrawi, Hung Viet Nguyen, and 2 more

Code completion helps improve developers' programming productivity. However, the current support for code completion is limited to context-free code templates or a single method call of the variable on focus. Using software libraries for development, developers often repeat API usages for certain tasks. Thus, a code completion tool could make use of API usage patterns. In this paper, we introduce GraPacc, a graph-based, pattern-oriented, context-sensitive code completion approach that is based on a database of such patterns. GraPacc represents and manages the API usage patterns of multiple...

10.1109/icse.2012.6227205 article EN 2012 34th International Conference on Software Engineering (ICSE) 2012-06-01

Multi-layered approach for recovering links between bug reports and fixes

Anh Tuan Nguyen, Tung Thanh Nguyen, Hoan Anh Nguyen, Tien N. Nguyen

The links between the bug reports in an issue-tracking system and the corresponding fixing changes in a version repository are not often recorded by developers. Such linking information is crucial for research in mining software repositories in measuring software defects and maintenance efforts. However, the state-of-the-art bug-to-fix link recovery approaches still rely much on textual matching between bug reports and commit/change logs and cannot handle well the cases where their contents are not textually similar.

10.1145/2393596.2393671 article EN 2012-11-11

Divide-and-Conquer Approach for Multi-phase Statistical Migration for Source Code (T)

Anh Tuan Nguyen, Tung Thanh Nguyen, Tien N. Nguyen

Prior research shows that directly applying phrase-based SMT on lexical tokens to migrate Java to C# produces much semantically incorrect code. A key limitation is the use of sequences in phrase-based SMT to model and translate source code with well-formed structures. We propose mppSMT, a divide-and-conquer technique to address that with novel training and migration algorithms using phrase-based SMT in three phases. First, mppSMT treats a program as a sequence of syntactic units and maps/translates such sequences in two languages to one another. Second, with a syntax-directed...

10.1109/ase.2015.74 article EN 2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE) 2015-11-01

Extended Unsteady Vortex-Lattice Method for Insect Flapping Wings

Anh Tuan Nguyen, Joong-Kwan Kim, Jong-Seob Han, Jae‐Hung Han

An extended unsteady vortex-lattice method is developed to study the aerodynamics of insect flapping wings while hovering and during forward flight. Leading-edge suction analogy and vortex-core growth models are used as an extension, which is incorporated into a conventional unsteady vortex-lattice method in an effort to overcome the challenges that arise when simulating insect flapping wings, such as wing–wake interaction and leading-edge effects. A convergence analysis was carried out to derive the optimal aerodynamic mesh and the time-step size for the flapping-wing models. A parallel...

10.2514/1.c033456 article EN Journal of Aircraft 2016-05-24

T2API: synthesizing API code usage templates from English texts with statistical translation

Thanh V. Nguyen, Peter C. Rigby, Anh Tuan Nguyen, Mark Karanfil, Tien N. Nguyen

In this work, we develop T2API, a statistical machine translation-based tool that takes a given English description of a programming task as a query, and synthesizes the API usage template for the task by learning from training data. T2API works in two steps. First, it derives the API elements relevant to the task described in the input by statistically learning from a StackOverflow corpus of text descriptions and corresponding code. To infer those API elements, it also considers the context of the words in the textual input and the API elements that often go together in the corpus. The inferred API elements with their relevance...

10.1145/2950290.2983931 article EN 2016-11-01

Wing flexibility effects on the flight performance of an insect-like flapping-wing micro-air vehicle

Anh Tuan Nguyen, Jae‐Hung Han

10.1016/j.ast.2018.06.007 article EN Aerospace Science and Technology 2018-06-06

Migrating code with statistical machine translation

Anh Tuan Nguyen, Tung Thanh Nguyen, Tien N. Nguyen

In the era of mobile computing, developers often need to migrate code written for one platform in a programming language to another language for a different platform, e.g., from Java for Android to C# for Windows Phone. The migration process is often performed manually or semi-automatically, in which developers are required to manually define translation rules and API mappings. This paper presents semSMT, an automatic tool to migrate source code from Java to C#. semSMT utilizes statistical machine translation to automatically infer translation rules from existing migrated code, and thus requires no manual defining of rules. The video...

10.1145/2591062.2591072 article EN 2014-05-20

Mapping API elements for code migration with vector representations

Trong Duc Nguyen, Anh Tuan Nguyen, Tien N. Nguyen

Problem. Code migration between languages is challenging partly because different languages require developers to use different software libraries and frameworks. For example, in Java, the Java Development Kit library (JDK) is a popular toolkit, while .NET is the main framework used in C# software development. Code migration requires not only the mappings between the language constructs (e.g., statements, expressions) but also the mappings among the APIs of the libraries/frameworks used in two languages. For example, to write to a file, one can use FileWriter.write of FileWriter in Java, while in C#, one can achieve the same function with...

10.1145/2889160.2892661 article EN 2016-05-14

Statistical learning of API fully qualified names in code snippets of online forums

Hung Phan, Hoan Anh Nguyen, Ngoc Mai Tran, Linh H. Truong, Anh Tuan Nguyen, and 1 more

Software developers often make use of online forums such as StackOverflow (SO) to learn how to use software libraries and their APIs. However, the code snippets in such a forum often contain undeclared, ambiguous, or largely unqualified external references. Such declaration ambiguity and external reference ambiguity present challenges for developers in learning to correctly use the APIs. In this paper, we propose StatType, a statistical approach to resolve the fully qualified names (FQNs) for the API elements in such code snippets. Unlike existing approaches that are based on heuristics,...

10.1145/3180155.3180230 article EN Proceedings of the 40th International Conference on Software Engineering 2018-05-27

A deep neural network language model with contexts for source code

Anh Tuan Nguyen, Trong Duc Nguyen, Hung Phan, Tien N. Nguyen

Statistical language models (LMs) have been applied in several software engineering applications. However, they have issues in dealing with ambiguities in the names of program and API elements (classes and method calls). In this paper, inspired by the success of Deep Neural Network (DNN) in natural language processing, we present Dnn4C, a DNN language model that complements the local context of lexical code elements with both syntactic and type contexts. We designed a context-incorporating method to use syntactic and type annotations for source code in order to learn to distinguish the lexical tokens in different...

10.1109/saner.2018.8330220 article EN 2018-03-01

Compressive stress drives adhesion-dependent unjamming transitions in breast cancer cell migration

Grace Cai, Anh Tuan Nguyen, Yashar Bashirzadeh, Shan‐Shan Lin, Dapeng Bi, and 1 more

Cellular unjamming is the collective fluidization of cell motion and has been linked to many biological processes, including development, wound repair, and tumor growth. In tumor growth, the uncontrolled proliferation of cancer cells in a confined space generates mechanical compressive stress. However, because multiple cellular and molecular mechanisms may be operating simultaneously, the role of compressive stress in unjamming transitions during cancer progression remains unknown. Here, we investigate which mechanism dominates in a dense, mechanically...

10.3389/fcell.2022.933042 article EN cc-by Frontiers in Cell and Developmental Biology 2022-10-04

Predicting air quality index using attention hybrid deep learning and quantum-inspired particle swarm optimization

Anh Tuan Nguyen, Duy Hoang Pham, Bee Lan Oo, Yonghan Ahn, Benson Teck‐Heng Lim

Air pollution poses a significant threat to the health of the environment and human well-being. The air quality index (AQI) is an important measure of air pollution that describes the degree of air pollution and its impact on health. Therefore, accurate and reliable prediction of the AQI is critical but challenging due to the non-linearity and stochastic nature of air particles. This research aims to propose an AQI prediction hybrid deep learning model based on the Attention Convolutional Neural Networks (ACNN), Autoregressive Integrated Moving Average (ARIMA), Quantum Particle Swarm...

10.1186/s40537-024-00926-5 article EN cc-by Journal Of Big Data 2024-05-11