Peiyi Han

ORCID: 0000-0003-0417-4473
Research Areas
  • Cryptography and Data Security
  • Anomaly Detection Techniques and Applications
  • Privacy-Preserving Technologies in Data
  • Cloud Data Security Solutions
  • Digital and Cyber Forensics
  • Network Security and Intrusion Detection
  • Advanced Malware Detection Techniques
  • Adversarial Robustness in Machine Learning
  • Advanced Steganography and Watermarking Techniques
  • Domain Adaptation and Few-Shot Learning
  • Software System Performance and Reliability
  • Time Series Analysis and Forecasting
  • Complexity and Algorithms in Graphs
  • Machine Learning and Algorithms
  • Chaos-based Image/Signal Encryption
  • Internet Traffic Analysis and Secure E-voting
  • Advanced Data Storage Technologies
  • AI in cancer detection
  • User Authentication and Security Systems
  • Bacillus and Francisella bacterial research
  • Digital Media Forensic Detection
  • Physical Unclonable Functions (PUFs) and Hardware Security
  • Machine Learning and Data Classification
  • Topic Modeling
  • Cryptographic Implementations and Security

Harbin Institute of Technology
2016-2025

Peng Cheng Laboratory
2021-2025

Shenzhen Institute of Information Technology
2020-2022

Beijing University of Posts and Telecommunications
2016-2020

Recently, self-play fine-tuning (SPIN) has garnered widespread attention as it enables large language models (LLMs) to iteratively enhance their capabilities through simulated interactions with themselves, transforming a weak LLM into a strong one. However, applying SPIN to fine-tune text-to-SQL models presents substantial challenges. Notably, existing frameworks lack clear signal feedback during the training process and fail to adequately capture the implicit schema-linking characteristics between natural...

10.3390/e27030235 article EN cc-by Entropy 2025-02-25

In this paper, we propose a method, namely Goalie, to defend against the correlated value and sign encoding attacks used to steal shared data from trusts. Existing methods prevent these attacks by perturbing model parameters, gradients, or training data, while significantly degrading model performance. To guarantee the performance of benign models, Goalie detects malicious models and stops their training. The key insight of the detection is that additional information encoded in the parameters through regularization terms changes the parameter...

10.3390/e27030323 article EN cc-by Entropy 2025-03-20

Anomaly detection for log sequences is a necessary task for intelligent system operation and fault diagnosis. In a log sequence, adjacent logs have the property of local correlation, while long-distance logs have remote dependencies. It is helpful to fully mine this information during modeling to improve the performance of anomaly detection. Meanwhile, there are some redundant or noisy logs in the sequence, which make no contribution to detection and may even have a negative impact. The existing methods for log sequence anomaly detection do not take the above problems into...

10.1109/tnsm.2021.3125967 article EN IEEE Transactions on Network and Service Management 2021-11-08

Searchable Encryption (SE) has been extensively examined by both academic and industry researchers. While many SE schemes show provable security, they usually expose some query information (e.g., search and access patterns) to achieve high efficiency. However, several inference attacks have exploited such leakage, e.g., a query recovery attack can convert opaque query trapdoors to their corresponding keywords based on some prior knowledge. On the other hand, many proposed SE schemes require significant modification of existing...

10.1109/access.2017.2786026 article EN cc-by-nc-nd IEEE Access 2017-12-27

The multi-label recognition of damaged waste bottles has important significance in environmental protection. However, most of the previous methods are known for their poor performance, especially with regard to damaged bottle classification. In this paper, we propose the use of a serial attention frame (SAF) to overcome the mentioned drawback. The proposed network architecture includes the following three parts: a residual learning block (RB), a mixed attention block (MAB), and a self-attention block (SAB). The RB uses ResNet to pretrain the SAF to extract more...

10.3390/app12031742 article EN cc-by Applied Sciences 2022-02-08

Searchable encryption (SE) schemes, such as those deployed for cyber-physical social systems, may be vulnerable to inference attacks. In such attacks, attackers seek to learn sensitive information about the queries and data stored on the (cyber-physical social) systems. However, these attacks are often based on strong (impractical) assumptions (e.g., complete knowledge of documents or known document injection) using access-pattern leakage. In this paper, we first identify different leakage models profiles...

10.1109/access.2018.2800684 article EN cc-by-nc-nd IEEE Access 2018-01-01

Large Language Model-based (LLM-based) Text-to-SQL methods have achieved important progress in generating SQL queries for real-world applications. When confronted with table content-aware questions in real-world scenarios, ambiguous data content keywords and non-existent database schema column names within the question lead to poor performance of existing methods. To solve this problem, we propose a novel approach towards Table Content-aware Self-Retrieval (TCSR-SQL). It leverages the LLM's in-context...

10.48550/arxiv.2407.01183 preprint EN arXiv (Cornell University) 2024-07-01

Encryption will play a more important role as computation and storage are outsourced. Achieving a good balance between strong security and application functionality preservation becomes a cutting-edge research problem. Searchable encryption (SE) algorithms, which delegate search capabilities to the cloud provider without decrypting the documents, are well studied. But this approach imposes extra constraints on the API and loses query expressiveness. This paper proposes a CASB-based framework for encrypted data sharing, which builds the index locally...

10.1109/iccnc.2017.7876165 article EN 2016 International Conference on Computing, Networking and Communications (ICNC) 2017-01-01

Browser-based cloud storage services are still broadly used in enterprises for online sharing and collaboration. However, sensitive information in images or documents may be easily leaked outside the trusted enterprise on-premises due to such cloud services. Existing solutions to prevent data leakage either limit many functionalities of cloud applications or are difficult to scale to various applications. In this paper, we propose CloudDLP, a transparent and scalable approach to automatically sanitize sensitive data in browser-based cloud services. CloudDLP is...

10.1109/access.2020.2985870 article EN cc-by IEEE Access 2020-01-01

According to the statistics of BrightPlanet in 2012, the information contained in the Deep Web is 400-500 times more than that in the Surface Web. Automatic data collection under the Deep Web has become one of the research hotspots in the domain of web crawlers. During the exploration of automatic dynamic crawlers, we discovered two inevitable situations which have not been processed properly by existing tools. One is that some websites adopt CSS pseudo-classes to locate clickable elements, which prevents existing technologies from simulating user actions. Another...

10.1109/dsc.2018.00042 article EN 2018-06-01

Confidential data is often encrypted before it is uploaded to cloud servers. However, client-controlled encryption poses a major barrier towards the full functionalities of cloud services. This paper presents SafeBox, a new Cloud Access Security Broker (CASB)-based approach that protects sensitive information against attackers with control of the cloud servers, and allows clients to search and share data transparently. It addresses the following challenges: First, SafeBox brings almost no loss when protecting data in applications...

10.1109/spac.2017.8304356 article EN 2021 International Conference on Security, Pattern Analysis, and Cybernetics (SPAC) 2017-12-01

Swarm learning (SL) is an emerging and promising decentralized machine learning paradigm that has achieved high performance in clinical applications. SL solves the problem of a central structure in federated learning by combining edge computing and a blockchain-based peer-to-peer network. While there are promising results under the assumption of independent and identically distributed (IID) data across participants, SL suffers from performance degradation as the degree of non-IID data increases. To address this problem, we propose a generative augmentation framework for swarm learning...

10.1109/icdis55630.2022.00058 article EN 2022-08-01

The primary business challenge for customers using outsourced computation and storage is the loss of data control and security. So encryption will become a commodity in the near future. There is a big diffusion with the above scenario: customers want to take advantage of the current application’s full functionalities while at the same time ensuring that their sensitive data remains protected and under the customers’ control. Prior works have achieved effective progress towards satisfying both sides. But there are still some technical challenges, such as...

10.1155/2016/8057208 article EN cc-by Scientific Programming 2016-01-01

Federated learning (FL) allows decentralized medical institutions to collaboratively learn a shared global model without breaching data privacy. However, in the context of medical image segmentation, data distributions across centers may vary a lot due to diverse imaging protocols, vendors, and partial annotation, which usually hampers the optimization convergence and performance of FL. In this paper, we propose a novel approach called federated knowledge augmentation (FedKA) to address the non-IID (non-independent and identically...

10.1109/icassp48485.2024.10445902 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18

Real‐world data always exhibit an imbalanced and long‐tailed distribution, which leads to poor performance for neural network‐based classification. Existing methods mainly tackle this problem by reweighting the loss function or rebalancing the classifier. However, one crucial aspect overlooked by previous research studies is the imbalanced feature space problem caused by the imbalanced angle distribution. In this paper, the authors shed light on the significance of the angle distribution in achieving a balanced feature space, which is essential for improving model performance under...

10.1049/cit2.12374 article EN cc-by-nd CAAI Transactions on Intelligence Technology 2024-10-24