- Software Engineering Research
- Software Testing and Debugging Techniques
- Adversarial Robustness in Machine Learning
- Software Reliability and Analysis Research
- Machine Learning and Data Classification
- Natural Language Processing Techniques
- Topic Modeling
- Advanced Malware Detection Techniques
- Software System Performance and Reliability
- Text and Document Classification Technologies
- Web Data Mining and Analysis
- Privacy-Preserving Technologies in Data
- Green IT and Sustainability
- Web Application Security Vulnerabilities
- Advanced Neural Network Applications
- Wireless Signal Modulation Classification
- Artificial Intelligence in Healthcare and Education
- Explainable Artificial Intelligence (XAI)
- Wireless Communication Security Techniques
- Millimeter-Wave Propagation and Modeling
- Educational Games and Gamification
- Advanced Data Storage Technologies
- Cognitive Radio Networks and Spectrum Sensing
- Advanced MIMO Systems Optimization
York University
2021-2025
University of Waterloo
2019-2020
Automated generate-and-validate (GV) program repair techniques (APR) typically rely on hard-coded rules, thus only fixing bugs following specific fix patterns. These rules require a significant amount of manual effort to discover, and it is hard to adapt these rules to different programming languages.
Automatic API recommendation has been studied for years. There are two orthogonal lines of approaches for this task, i.e., information-retrieval-based (IR-based) and neural-based methods. Although these approaches were reported as having remarkable performance, our observation shows that existing approaches can fail due to the following reasons: 1) most IR-based approaches treat task queries as bag-of-words and use word embedding to represent queries, which cannot capture sequential semantic information; 2) both are weak at distinguishing...
Automatic unit test generation, which explores the input space and produces effective test cases for given programs, has been studied for decades. Many tools that can help generate test cases with high structural coverage over a program have been examined. However, the fact that existing tools are mainly evaluated on general software programs calls into question their practical effectiveness and usefulness for machine learning libraries, which are statistically orientated and have a fundamentally different nature and construction from general software projects. In this paper, we set out to...
In recent years, the practice of fuzzing Deep Learning (DL) APIs has received significant attention in the software engineering community. Many API-level DL fuzzers have been proposed to test individual APIs by generating malformed input. Although these fuzzers are effective in detecting bugs and outperforming prior work, there remains a gap in benchmarking them against ground-truth, real-world bugs in DL libraries. Existing comparisons among DL fuzzers primarily focus on detected bugs but do not offer a comprehensive, in-depth evaluation...
Deep Learning (DL) libraries have significantly impacted various domains in computer science over the last decade. However, developers often face challenges when using DL APIs, as the development paradigm of DL applications differs greatly from traditional software development. Existing studies on API misuse mainly focus on traditional software, leaving a gap in understanding API misuse within DL APIs. To address this gap, we present the first comprehensive study of DL API misuse in TensorFlow and PyTorch. Specifically, we collected a dataset of 4,224 commits...
Recently, deep learning models have shown promising results in test oracle generation. Static evaluation metrics from Natural Language Generation (NLG), such as BLEU, CodeBLEU, ROUGE-L, METEOR, and Accuracy, which are mainly based on textual comparisons, have been widely adopted to measure the performance of Neural Oracle Generation (NOG) models. However, these NLG-based metrics may not reflect the testing effectiveness of the generated oracle within a test suite, which is often measured by dynamic (execution-based) test adequacy metrics such as code coverage...
Automatic API recommendation can accelerate developers’ programming, and has been studied for years. There are two orthogonal lines of approaches for this task, i.e., information retrieval-based (IR-based) approaches and sequence to sequence (seq2seq) model based approaches. Although these approaches were reported to have remarkable performance, our observation finds two major drawbacks, i.e., IR-based approaches lack the consideration of relations among the recommended APIs, and seq2seq models do not model the API’s semantic meaning. To alleviate the above two problems, we propose...
Automated generate-and-validate (G&V) program repair techniques typically rely on hard-coded rules, only fix bugs following specific patterns, and are hard to adapt to different programming languages. We propose ENCORE, a new G&V technique, which uses ensemble learning on convolutional neural machine translation (NMT) models to automatically fix bugs in multiple programming languages. We take advantage of the randomness in hyper-parameter tuning to build multiple models that fix different bugs and combine them using ensemble learning. This NMT approach outperforms the standard long...
Application Programming Interfaces (APIs) are designed to help developers build software more effectively. Recommending the right APIs for specific tasks is gaining increasing attention among researchers and developers. However, most of the existing approaches are mainly evaluated for general programming tasks using statically typed programming languages such as Java. Little is known about their practical effectiveness and usefulness for machine learning (ML) programming tasks with dynamically typed programming languages such as Python, whose paradigms are fundamentally different from...
Deep learning-based code processing models have shown good performance for tasks such as predicting method names, summarizing programs, and comment generation. However, despite the tremendous progress, deep learning models are often prone to adversarial attacks, which can significantly threaten the robustness and generalizability of these models by leading them to misclassification with unexpected inputs. To address the above issue, many deep learning testing approaches have been proposed; however, these approaches mainly focus on applications in domains...
Machine learning (ML) has been increasingly used in a variety of domains, while solving ML programming tasks poses unique challenges due to the fundamental difference in the nature and the construct of general programming tasks, especially for developers who do not have ML backgrounds. Automatic code generation that produces a code snippet from a natural language description can be a promising technique to accelerate ML programming tasks. In recent years, although many deep learning-based neural code generation models have been proposed with high accuracy, the fact that most of them...
Split learning is a privacy-preserving distributed learning paradigm in which an ML model (e.g., a neural network) is split into two parts (i.e., an encoder and a decoder). The encoder shares a so-called latent representation, rather than raw data, for model training. In mobile-edge computing, network functions (such as traffic forecasting) can be trained via split learning, where an encoder resides in a user equipment (UE) and a decoder resides in the edge network. Based on the data processing inequality and the information bottleneck (IB) theory, we present a new framework and training...
Deep learning (DL)-based code processing models have demonstrated good performance for tasks such as method name prediction, program summarization, and comment generation. However, despite the tremendous advancements, DL models are frequently susceptible to adversarial attacks, which pose a significant threat to the robustness and generalizability of these models by causing them to misclassify unexpected inputs. To address the issue above, numerous DL testing approaches have been proposed; however, these approaches primarily target applications in...
Recently, many Deep Learning (DL) fuzzers have been proposed for API-level testing of DL libraries. However, they either perform unguided input generation (e.g., not considering the relationship between API arguments when generating inputs) or only support a limited set of corner-case test inputs. Furthermore, developer APIs, which are crucial for library development, remain untested, as they are typically not well documented and lack clear usage guidelines, unlike end-user APIs. This makes them a more challenging target...
Checker bugs in Deep Learning (DL) libraries are critical yet not well-explored. These bugs are often concealed in the input validation and error-checking code of DL libraries and can lead to silent failures, incorrect results, or unexpected program behavior in DL applications. Despite their potential to significantly impact the reliability and performance of DL-enabled systems built with these libraries, checker bugs have received limited attention. We present the first comprehensive study of DL checker bugs in two widely-used DL libraries, i.e., TensorFlow and PyTorch. Initially, we...
A scalable and immersive game was developed to serve as a monthly concept review for theory-heavy engineering courses (such as fluid dynamics or heat transfer). It was designed such that in-game items and content may be dynamically and easily replaced with an Excel data table, without the need for further programming. It is expected that course instructors may use the tool by simply updating the data table to rapidly tailor the game to any course. Even the room/zone designs are parametrized using the data table. Given the automation level of the tool, its scalability...
We introduce SkipAnalyzer, a large language model (LLM)-powered tool for static code analysis. SkipAnalyzer has three components: 1) an LLM-based static bug detector that scans source code and reports specific types of bugs, 2) an LLM-based false-positive filter that can identify false-positive bugs in the results of static bug detectors (e.g., the result of step 1) to improve detection accuracy, and 3) an LLM-based patch generator that can generate patches for the detected bugs above. As a proof-of-concept, SkipAnalyzer is built on ChatGPT, which has exhibited outstanding performance in various software engineering...
Machine learning (ML) has been increasingly used in a variety of domains, while solving ML programming tasks poses unique challenges because of the fundamentally different nature and construction from general programming tasks, especially for developers who do not have ML backgrounds. Automatic code generation that produces a code snippet from a natural language description can be a promising technique to accelerate ML programming tasks. In recent years, although many deep learning-based neural code generation models have been proposed with high accuracy, the fact that most...
Many automatic unit test generation tools that can generate unit test cases with high coverage over a program have been proposed. However, most of these tools are ineffective on deep learning (DL) frameworks due to the fact that many DL APIs expect inputs that follow specific API knowledge. To fill this gap, we propose MUTester to generate unit test cases for APIs of DL frameworks by leveraging the API constraints mined from the corresponding API documentation and the API usage patterns mined from code fragments in Stack Overflow (SO). Particularly, we first propose a set of 18 rules for mining API constraints from the API documents. We then use...
In this work, we revisit existing oracle generation studies plus ChatGPT to empirically investigate the current standing of their performance in both NLG-based and test adequacy metrics. Specifically, we train and run four state-of-the-art test oracle generation models on five NLG-based and two test adequacy metrics for our analysis. We apply different correlation analyses between these two sets of metrics. Surprisingly, we found no significant correlation between the NLG-based metrics and the test adequacy metrics. For instance, oracles generated from ChatGPT on the project activemq-artemis had the highest performance on all the NLG-based metrics among the studied NOGs; however, it had the largest number...
Application Programming Interfaces (APIs) are designed to help developers build software more effectively. Recommending the right APIs for specific tasks has gained increasing attention among researchers and developers in recent years. To comprehensively understand this research domain, we have surveyed to analyze API recommendation studies published in the last 10 years. Our study begins with an overview of the structure of API recommendation tools. Subsequently, we systematically analyze prior research and pose four key research questions. For RQ1, we examine the volume of papers...