Xuhong Zhang

ORCID: 0000-0002-8571-9780
Research Areas
  • Adversarial Robustness in Machine Learning
  • Advanced Malware Detection Techniques
  • Topic Modeling
  • Advanced Data Storage Technologies
  • Security and Verification in Computing
  • Cloud Computing and Resource Management
  • Privacy-Preserving Technologies in Data
  • Network Security and Intrusion Detection
  • Software Engineering Research
  • Natural Language Processing Techniques
  • Anomaly Detection Techniques and Applications
  • Distributed and Parallel Computing Systems
  • Software Testing and Debugging Techniques
  • Explainable Artificial Intelligence (XAI)
  • High-Energy Particle Collisions Research
  • Caching and Content Delivery
  • Cryptography and Data Security
  • Web Application Security Vulnerabilities
  • Advanced Graph Neural Networks
  • Digital Media Forensic Detection
  • Complex Network Analysis Techniques
  • Face recognition and analysis
  • Statistical Mechanics and Entropy
  • Generative Adversarial Networks and Image Synthesis
  • Digital and Cyber Forensics

Ningbo University
2022-2025

Zhejiang University
2020-2025

Zhejiang University of Science and Technology
2023-2025

Xi'an University of Technology
2023-2024

Guangdong Polytechnic Normal University
2013-2023

Shanxi University
2021-2023

Institute of Theoretical Physics
2021-2023

National University of Defense Technology
2022

State Key Laboratory of Quantum Optics and Quantum Optics Devices
2021-2022

Binzhou University
2022

Pre-trained general-purpose language models have been a dominating component in enabling real-world natural language processing (NLP) applications. However, a pre-trained model with a backdoor can be a severe threat to the applications. Most existing backdoor attacks in NLP are conducted in the fine-tuning phase by introducing malicious triggers in the targeted class, thus relying greatly on the prior knowledge of the fine-tuning task. In this paper, we propose a new approach to map the inputs containing triggers directly to a predefined output representation of the pre-trained NLP models, e.g., for...

10.1145/3460120.3485370 article EN Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security 2021-11-12

Deep neural networks (DNNs) have demonstrated their outperformance in various domains. However, it raises a social concern whether DNNs can produce reliable and fair decisions, especially when they are applied to sensitive domains involving valuable resource allocation, such as education, loan, and employment. It is crucial to conduct fairness testing before DNNs are reliably deployed to sensitive domains, i.e., generating as many instances as possible to uncover fairness violations. However, the existing testing methods are still limited from three aspects:...

10.1145/3510003.3510123 article EN Proceedings of the 44th International Conference on Software Engineering 2022-05-21

Vertical Federated Learning (VFL) is a trending collaborative machine learning model training solution. Existing industrial frameworks employ secure multi-party computation techniques such as homomorphic encryption to ensure data security and privacy. Despite these efforts, studies have revealed that data leakage remains a risk in VFL due to the correlations between the intermediate representations and the raw data. Neural networks can accurately capture these correlations, allowing an adversary to reconstruct the data. This...

10.1109/tifs.2024.3356164 article EN IEEE Transactions on Information Forensics and Security 2024-01-01

Recently, the use of large language models (LLMs) for Verilog code generation has attracted great research interest to enable hardware design automation. However, previous works have shown a gap between the ability of LLMs and the practical demands of hardware description language (HDL) engineering. This gap includes differences in how engineers phrase questions and hallucinations in the code generated. To address these challenges, we introduce HaVen, a novel LLM framework designed to mitigate hallucinations and align Verilog code generation with the practices of HDL engineers. HaVen tackles...

10.48550/arxiv.2501.04908 preprint EN arXiv (Cornell University) 2025-01-08

DeepFakes pose a significant threat to our society. One representative DeepFake application is face-swapping, which replaces the identity in a facial image with that of a victim. Although existing methods partially mitigate these risks by degrading the quality of swapped images, they often fail to disrupt the identity transformation effectively. To fill this gap, we propose FaceSwapGuard (FSG), a novel black-box defense mechanism against deepfake face-swapping threats. Specifically, FSG introduces imperceptible...

10.48550/arxiv.2502.10801 preprint EN arXiv (Cornell University) 2025-02-15

With the continuous advancement of machine learning, numerous malware detection methods that leverage this technology have emerged, presenting new challenges to the generation of adversarial malware. Existing function-preserving attacks fall short of effectively modifying portable executable (PE) control flow graphs (CFGs), thereby failing to bypass graph neural network (GNN) models that utilize CFGs for detection. To solve this issue, we introduce a novel base modification method called active opcode insertion,...

10.1038/s41598-025-92023-7 article EN cc-by-nc-nd Scientific Reports 2025-03-17

As the core of IoT devices, firmware is undoubtedly vital. Currently, firmware development heavily depends on third-party components (TPCs), which significantly improves development efficiency and reduces cost. Nevertheless, TPCs are not secure, and vulnerabilities in TPCs will in turn influence the security of firmware. Existing works pay less attention to the vulnerabilities caused by TPCs, and we still lack a comprehensive understanding of the impact of TPC vulnerabilities on firmware. To fill this knowledge gap, we design and implement FirmSec, which leverages syntactical...

10.1145/3533767.3534366 article EN 2022-07-15

Vertical federated learning (VFL) is an emerging privacy-preserving paradigm that enables collaboration between companies. These companies have the same set of users but different features. One of them is interested in expanding new business or improving its current service with others' features. For instance, an e-commerce company that wants to improve its recommendation performance can incorporate users' preferences from another corporation, such as a social media company, through VFL. On the other hand, graph data...

10.1109/tdsc.2022.3208630 article EN IEEE Transactions on Dependable and Secure Computing 2022-09-22

In recent years, DeepFake technologies have seen widespread adoption in various domains, including entertainment and film production. However, they have also been maliciously employed for disseminating false information and engaging in video fraud. Existing detection methods often experience significant performance degradation when confronted with unknown forgeries or exhibit limitations in dealing with low-quality images. To address this challenge, we introduce...

10.1109/tdsc.2024.3364679 article EN IEEE Transactions on Dependable and Secure Computing 2024-02-21

The distributed file system, HDFS, is widely deployed as the bedrock for many parallel big data analyses. However, when running multiple parallel applications over the shared file system, the data requests from different processes/executors will unfortunately be served in a surprisingly imbalanced fashion on the distributed storage servers. These imbalanced access patterns among storage nodes are caused because a) unlike conventional parallel file systems, which use striping policies to evenly distribute data among storage nodes, data-intensive file systems such as HDFS store each data unit, referred to as a chunk file, with...

10.1109/tc.2017.2749229 article EN publisher-specific-oa IEEE Transactions on Computers 2017-09-07

In this paper, we aim to enable both efficient and accurate approximations on arbitrary sub-datasets of a large dataset. Due to the prohibitive storage overhead of caching offline samples for each sub-dataset, existing sample-based systems provide high-accuracy results for only a limited number of sub-datasets, such as the popular ones. On the other hand, current online approximation systems, which generate samples at runtime, do not take into account the uneven distribution of a sub-dataset. They work well for uniform sub-dataset...

10.14778/3021924.3021928 article EN Proceedings of the VLDB Endowment 2016-11-01

The energy allocation strategy is one of the most popular techniques in fuzzing to improve code coverage and vulnerability discovery. The core intuition is that fuzzers should allocate more computational energy to the seed files that have high efficiency to trigger unique paths and crashes after mutation. Existing solutions usually define several properties, e.g., the execution speed, the file size, and the number of triggered edges in the control flow graph, to serve as the key measurements in their allocation logics to estimate the potential of a seed. The property is assumed to be the same...

10.1145/3533767.3534385 article EN 2022-07-15

With the rapid technology evolution of the Internet of Things (IoT) and increasing user needs, IoT device re-using becomes more and more common nowadays. For instance, more than 300,000 used IoT devices are selling on Craigslist. During re-using, sensitive data such as credentials and biometrics residing in these devices may face the risk of leakage if a user fails to properly dispose of the data. Thus, a critical security concern is raised: do (or can) users properly dispose of the sensitive data in used IoT? To the best of our knowledge, it is still an unexplored problem that desires a systematic study. In...

10.1109/sp46215.2023.10179294 article EN 2023 IEEE Symposium on Security and Privacy (SP) 2023-05-01

With the wide application and deployment of cloud computing in enterprises, virtualization developers and security researchers are paying more attention to cloud computing security. The core component of cloud computing products is the hypervisor, which is also known as the virtual machine monitor (VMM) and can isolate multiple virtual machines on one host machine. However, compromising the hypervisor can lead to virtual machine escape and elevation of privilege, allowing attackers to gain the permission of code execution on the host. Therefore, the security analysis and vulnerability detection of the hypervisor are critical for...

10.1145/3460120.3484811 article EN Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security 2021-11-12

Despite its tremendous popularity and success in computer vision (CV) and natural language processing, deep learning is inherently vulnerable to adversarial attacks, in which adversarial examples (AEs) are carefully crafted by imposing imperceptible perturbations on clean examples to deceive the target deep neural networks (DNNs). Many defense solutions in CV have been proposed. However, most of them, e.g., adversarial training, suffer from low generality due to their reliance on limited AEs. Moreover, some even have a non-negligible negative impact...

10.1109/tdsc.2021.3124337 article EN IEEE Transactions on Dependable and Secure Computing 2021-11-02

Online Microlending, a new financial service, focuses on small loans without any sort of collateral. It provides more flexible and quicker funding for borrowers, as well as higher interest rates of return. For platforms that provide such services, an essential task is to adequately evaluate each loan's risk so as to minimize the possible loss. However, there exists a special group, namely fraud-agents,...

10.1109/tdsc.2022.3151132 article EN IEEE Transactions on Dependable and Secure Computing 2022-02-14

Neural networks have become increasingly popular. Nevertheless, understanding their decision process turns out to be complicated. One vital method to explain a model's behavior is feature attribution, i.e., attributing its decision to pivotal features. Although many algorithms are proposed, most of them aim to improve the faithfulness (fidelity) to the model. However, the real environment contains random noises, which may cause the feature attribution maps to be greatly perturbed for similar images. More seriously, recent works show that...

10.1145/3548606.3559392 article EN Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security 2022-11-07

Mutation-based fuzzing is one of the most popular approaches to discover vulnerabilities in a program. To alleviate the inefficiency of mutation-based fuzzing incurred by the high randomness in the mutation process, multiple solutions have been developed in recent years, especially coverage-based fuzzing. They mainly employ adaptive mutation strategies or integrate constraint-solving techniques to make a good exploration of the test cases which trigger unique paths and crashes. However, they lack a fine-grained reusing of fuzzing history to construct these...

10.14722/ndss.2022.23162 article EN 2022-01-01

10.1016/j.physa.2019.04.031 article EN Physica A Statistical Mechanics and its Applications 2019-04-05

In this paper, we study the problem of sub-dataset analysis over distributed file systems, e.g., the Hadoop file system. Our experiments show that the sub-datasets' distribution over HDFS blocks, which is hidden by HDFS, can often cause the corresponding analyses to suffer from a seriously imbalanced or inefficient parallel execution. Specifically, the content clustering of sub-datasets results in some computational nodes carrying out much more workload than others; furthermore, it leads to inefficient sampling of sub-datasets, as programs will read...

10.1109/tbdata.2016.2632744 article EN publisher-specific-oa IEEE Transactions on Big Data 2016-11-29