- Advanced Malware Detection Techniques
- Software Testing and Debugging Techniques
- Software Engineering Research
- Security and Verification in Computing
- Adversarial Robustness in Machine Learning
- Topic Modeling
- Natural Language Processing Techniques
- Speech Recognition and Synthesis
- Cloud Computing and Resource Management
- Software System Performance and Reliability
- Anomaly Detection Techniques and Applications
- Digital and Cyber Forensics
- Speech and Audio Processing
- Bayesian Modeling and Causal Inference
- Software Reliability and Analysis Research
- Parallel Computing and Optimization Techniques
- Data Quality and Management
- Music and Audio Processing
- Speech and dialogue systems
- Advanced Decision-Making Techniques
- Distributed and Parallel Computing Systems
- Sentiment Analysis and Opinion Mining
- Advanced Neural Network Applications
- Web Application Security Vulnerabilities
- Explainable Artificial Intelligence (XAI)
University of Hong Kong
2020-2025
Hong Kong University of Science and Technology
2020-2025
Guangxi Medical University
2025
Jiangsu University of Science and Technology
2024
Peking University
2019-2024
Beijing Institute of Technology
2022-2024
Shenyang Agricultural University
2024
Yanshan University
2024
Zhejiang Provincial People's Hospital
2021-2024
Southeast Asia University
2024
With the thriving of mobile app markets, third-party libraries are pervasively integrated in Android applications. Third-party libraries provide functionality such as advertisements, location services, and social networking, making multi-functional app development much more productive. However, the spread of vulnerable or harmful libraries may also hurt the entire ecosystem, leading to various security problems. The Android platform suffers severely from such problems due to the way its ecosystem is constructed and maintained. Therefore, library...
We theoretically analyze the Einstein-Podolsky-Rosen (EPR) correlation, quadrature squeezing, and continuous-variable quantum teleportation when considering non-Gaussian entangled states generated by applying multiple-photon subtraction and addition to a two-mode squeezed vacuum state (TMSVs). Our results indicate that in the case of multiple-photon-subtracted TMSVs with symmetric operations, the corresponding EPR correlation, the squeezing degree, the sum squeezing, and the fidelity of teleporting a coherent or squeezed vacuum state can be enhanced for any parameter r...
Code completion, a highly valuable topic in the software development domain, has been increasingly promoted for use by recent advances in large language models (LLMs). To date, visible LLM-based code completion frameworks such as GitHub Copilot and GPT are trained using deep learning over vast quantities of unstructured text and open source code. As a paramount component and cornerstone of daily programming tasks, code completion has largely boosted professionals' efficiency in building real-world systems. In contrast to this...
More and more app developers use packing services (or packers) to prevent attackers from reverse engineering and modifying the executable (or Dex files) of their apps. At the same time, malware authors also use packers to hide the malicious component and evade signature-based detection. Although there are a few recent studies on unpacking Android apps, it has been shown that evolving packers can easily circumvent them because they are not adaptive to the changes of packers. In this paper, we propose a novel approach and develop a new system, named...
Since more than 96 percent of mobile malware targets the Android platform, various techniques based on static code analysis or dynamic behavior analysis have been proposed to detect malicious apps. As malware is becoming more complicated and stealthy, recent research proposed a promising detection approach that looks for inconsistency between an app's permissions and its description. In this paper, we first revisit this approach and reveal that using description and permission will lead to many false positives because descriptions often fail to declare all...
Detecting similar functions in binary executables serves as a foundation for many code analysis and reuse tasks. So far, recognizing similar components remains a challenge. Existing research employs either static or dynamic approaches to capture program syntax or semantics-level features for comparison. However, there exist multiple design limitations in previous work, which result in relatively high cost, low accuracy, and low scalability, and thus severely impede their practical use. In this paper, we present a novel method...
The prosperous trend of deploying deep neural network (DNN) models to diverse hardware platforms has boosted the development of deep learning (DL) compilers. DL compilers take high-level DNN model specifications as input and generate optimized DNN executables for diverse hardware architectures like CPUs, GPUs, and various hardware accelerators. Compiling DNN models into high-efficiency executables is not easy: the compilation procedure often involves converting models into several different intermediate representations (IR), e.g., graph IR and operator IR, and performing...
Siddharth Varia, Shuai Wang, Kishaloy Halder, Robert Vacareanu, Miguel Ballesteros, Yassine Benajiba, Neha Anna John, Rishita Anubhai, Smaranda Muresan, Dan Roth. Proceedings of the 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis. 2023.
In light of the growing popularity of Exploratory Data Analysis (EDA), understanding the underlying causes of the knowledge acquired by EDA is crucial. However, it remains under-researched. This study promotes a transparent and explicable perspective on data analysis, called eXplainable Data Analysis (XDA). For this reason, we present XInsight, a general framework for XDA. XInsight provides data analysis with qualitative and quantitative explanations of causal and non-causal semantics. This way, it will significantly improve human confidence in...
The prosperous trend of deploying complex applications to web browsers has boosted the development of WebAssembly (wasm) compilation toolchains. Software written in different high-level programming languages is compiled into wasm executables, which can be executed fast and safely in a virtual machine. The performance of wasm executables depends highly on compiler optimizations. Despite their use, recent research indicated that real-world wasm executables are slower than anticipated, suggesting deficiencies...
Recent advances in large language models (LLMs) significantly boost their usage in software engineering. However, training a well-performing LLM demands a substantial workforce for data collection and annotation. Moreover, training datasets may be proprietary or only partially open, and the training process often requires a costly GPU cluster. The intellectual property value of commercial LLMs makes them attractive targets for imitation attacks, but creating an imitation model with comparable parameters still incurs high costs. This...
Software instrumentation techniques are widely used in program analysis tasks such as profiling, vulnerability discovery, and security-oriented transformation. In this paper, we present an instrumentation tool called UROBOROS, which supports static instrumentation on stripped binaries. Due to the lack of relocation and debug information, reverse engineering of stripped binaries is challenging. Compared with previous work, UROBOROS can provide complete, easy-to-use, transparent, and efficient static instrumentation, with completeness achieved by statically recovering relocatable...
The Markov decision process (MDP) provides a mathematical framework for modeling sequential decision-making problems, many of which are crucial to security and safety, such as autonomous driving and robot control. The rapid development of artificial intelligence research has created efficient methods for solving MDPs, such as deep neural networks (DNNs), reinforcement learning (RL), and imitation learning (IL). However, these popular models for solving MDPs are neither thoroughly tested nor rigorously reliable.
Function recognition in program binaries serves as the foundation for many binary instrumentation and analysis tasks. However, binaries are usually stripped before distribution, and function information is indeed absent in most binaries. So far, identifying functions in stripped binaries remains a challenge. Recent research work proposes to recognize functions in binary code through machine learning techniques. The model, including typical function entry point patterns, is automatically constructed through learning. However, we observed that previous work only leverages...
A C decompiler converts an executable (the output from a C compiler) into source code. The recovered source code, once recompiled, will produce an executable with the same functionality as the original executable. With over twenty years of development, C decompilers have been widely used in production to support reverse engineering applications, including legacy software migration, security retrofitting, and software comprehension, and to act as the first step in launching adversarial software exploitations. As a paramount component and the trust base of numerous...
The rise of large language model-based code generation (LLCG) has enabled various commercial services and APIs. Training LLCG models is often expensive and time-consuming, and the training data are often large-scale and even inaccessible to the public. As a result, the risk of intellectual property (IP) theft over LLCG models (e.g., via imitation attacks) has been a serious concern. In this paper, we propose the first watermark (WM) technique to protect LLCG APIs from remote imitation attacks. Our proposed technique is based on replacing tokens in an LLCG output with their...
Smartphone users are installing more and bigger apps. Meanwhile, each app carries a considerable amount of unused stuff, called software bloat, in its apk file. As a result, the resources of a smartphone, such as hard disk and network bandwidth, have become even more insufficient than ever before. Therefore, it is critical to investigate existing apps on the market and app development to identify the sources of software bloat and develop techniques and tools to remove the bloat. In this paper, we present a comprehensive study of software bloat in Android applications,...
Cryptographic implementations bolster security against timing side-channel attacks by integrating constant-time components. However, the new ciphertext side channels, resulting from deterministic memory encryption in Trusted Execution Environments (TEEs), enable ciphertexts to manifest identifiable patterns when being sequentially written to the same memory address. Attackers with read access to encrypted memory in TEEs can potentially deduce plaintexts by analyzing these changing ciphertext patterns. In this paper, we design...
LLMs increasingly serve as general-purpose AI assistants in daily life, and their subtly unethical suggestions have become a serious and real concern. It is demanding to test and mitigate such unethical suggestions from LLMs. Despite existing efforts to detect violations of “testable” facets of ethics (e.g., fairness testing), it is challenging to encode the full scope of ethics (e.g., justice, deontology) into a test oracle without human annotations or intervention. In this paper, we take inspiration from reflective equilibrium, a modern moral reasoning method...