- Cryptography and Data Security
- Advanced Authentication Protocols Security
- Security in Wireless Sensor Networks
- Privacy-Preserving Technologies in Data
- Chaos-based Image/Signal Encryption
- User Authentication and Security Systems
- Mobile Ad Hoc Networks
- Cryptography and Residue Arithmetic
- Network Security and Intrusion Detection
- Advanced Neural Network Applications
- Mobile Agent-Based Network Management
- Internet Traffic Analysis and Secure E-voting
- Topic Modeling
- Energy Efficient Wireless Sensor Networks
- Privacy, Security, and Data Protection
- Quantum Computing Algorithms and Architecture
- Cloud Data Security Solutions
- Advanced Steganography and Watermarking Techniques
- RFID Technology Advancements
- Natural Language Processing Techniques
- Access Control and Trust
- Cooperative Communication and Network Coding
- Cognitive Radio Networks and Spectrum Sensing
- Opportunistic and Delay-Tolerant Networks
- Wireless Communication Security Techniques
- IIT@MIT (2025)
- Bayi Children's Hospital (2025)
- Zhejiang University (2008-2024)
- City University of Macau (2023-2024)
- Hangzhou City University (2023-2024)
- Zhejiang Gongshang University (2013-2023)
- Yunnan University (2023)
- University of Jinan (2023)
- China University of Geosciences (2021-2023)
- State Nuclear Power Technology Company (China) (2023)
Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems with limited hardware resources. To address this limitation, we introduce "deep compression", a three-stage pipeline: pruning, trained quantization, and Huffman coding, which work together to reduce the storage requirement of neural networks by 35x to 49x without affecting their accuracy. Our method first prunes the network by learning only the important connections. Next, we quantize the weights to enforce...
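The three stages compose naturally; below is a minimal sketch on a toy weight matrix, assuming a 90% pruning ratio, a 16-entry (4-bit) codebook fit with plain Lloyd iterations rather than the paper's retrained centroids, and illustrative helper names throughout.

```python
import heapq
from collections import Counter

import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(0, 1, (64, 64)).astype(np.float32)

# Stage 1: magnitude pruning -- zero out the smallest 90% of weights.
threshold = np.quantile(np.abs(W), 0.9)
mask = np.abs(W) >= threshold
values = W[mask]                          # surviving connections only

# Stage 2: trained quantization via weight sharing -- cluster survivors into
# a 16-entry (4-bit) codebook (one-shot Lloyd iterations here; the paper also
# retrains the shared centroids with gradient descent).
k = 16
centroids = np.linspace(values.min(), values.max(), k)
for _ in range(10):
    codes = np.abs(values[:, None] - centroids[None, :]).argmin(axis=1)
    for j in range(k):
        if np.any(codes == j):
            centroids[j] = values[codes == j].mean()

# Stage 3: Huffman coding -- the skewed histogram of codebook indices
# compresses below the fixed 4 bits per surviving weight.
def huffman_lengths(symbols):
    heap = [(f, [s]) for s, f in Counter(symbols).items()]
    heapq.heapify(heap)
    lengths = Counter()
    while len(heap) > 1:
        f1, s1 = heapq.heappop(heap)
        f2, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1               # one level deeper in the code tree
        heapq.heappush(heap, (f1 + f2, s1 + s2))
    return lengths

lengths = huffman_lengths(codes.tolist())
avg_bits = sum(lengths[s] for s in codes.tolist()) / codes.size
print(f"average bits per surviving weight: {avg_bits:.2f} (vs. 4 fixed)")
```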
Watermarking, which belongs to the information hiding field, has seen a lot of research interest. Work is being conducted in the different branches of this field: steganography is used for secret communication, whereas watermarking is used for content protection, copyright management, authentication, and tamper detection. In this paper we present a detailed survey of existing and newly proposed steganographic techniques. We classify the techniques based on the domains in which the data is embedded, and we limit the survey to images only.
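As a concrete instance of the spatial-domain techniques such a survey classifies, here is a minimal least-significant-bit (LSB) embedding sketch; the toy image size and one-bit-per-pixel payload are illustrative assumptions.

```python
import numpy as np

def embed_lsb(cover: np.ndarray, bits: np.ndarray) -> np.ndarray:
    """Hide one message bit in the least significant bit of each pixel."""
    flat = cover.flatten()                      # flatten() copies, cover intact
    flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits
    return flat.reshape(cover.shape)

def extract_lsb(stego: np.ndarray, n_bits: int) -> np.ndarray:
    """Recover the first n_bits message bits from the stego image."""
    return stego.flatten()[:n_bits] & 1

rng = np.random.default_rng(1)
cover = rng.integers(0, 256, (8, 8), dtype=np.uint8)   # toy grayscale image
message = rng.integers(0, 2, 16, dtype=np.uint8)
stego = embed_lsb(cover, message)
assert np.array_equal(extract_lsb(stego, 16), message)
# Per-pixel distortion is at most 1 gray level, which is what makes LSB
# embedding visually imperceptible (and also fragile to re-quantization).
print(np.abs(stego.astype(int) - cover.astype(int)).max())
```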
Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems. Also, conventional networks fix the architecture before training starts; as a result, training cannot improve the architecture. To address these limitations, we describe a method to reduce the storage and computation required by neural networks by an order of magnitude without affecting their accuracy, by learning only the important connections. Our method prunes redundant connections using a three-step process. First, we train...
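A minimal sketch of the train/prune/retrain loop on a single layer, assuming an 80% sparsity target and a gradient mask to keep pruned connections at zero; the toy task and hyperparameters are illustrative, not the paper's.

```python
import torch
import torch.nn as nn

layer = nn.Linear(64, 64)
x, y = torch.randn(256, 64), torch.randn(256, 64)
opt = torch.optim.SGD(layer.parameters(), lr=0.1)

def fit(steps):
    for _ in range(steps):
        opt.zero_grad()
        nn.functional.mse_loss(layer(x), y).backward()
        if hasattr(layer, "mask"):          # keep pruned weights at zero
            layer.weight.grad *= layer.mask
        opt.step()

fit(100)                                    # 1) train: learn which weights matter
thresh = layer.weight.abs().flatten().quantile(0.8)
layer.mask = (layer.weight.abs() >= thresh).float()
with torch.no_grad():
    layer.weight *= layer.mask              # 2) prune low-magnitude connections
fit(100)                                    # 3) retrain the surviving weights
print(f"sparsity: {(layer.weight == 0).float().mean():.0%}")
```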
With the increasingly powerful and extensive deployment of edge devices, edge/fog computing enables customers to manage and analyze data locally, and extends the power of analysis applications to network edges. Meanwhile, as the next-generation power grid, the smart grid can achieve the goals of efficiency, economy, security, reliability, safe use, and environmental friendliness. However, privacy and security issues in fog-based communications are challenging: without proper protection, customers' privacy will be readily violated. This...
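For intuition, the sketch below shows a textbook masking-based aggregation of the kind fog-assisted smart-grid schemes build on: pairwise random masks cancel in the sum, so the aggregator learns only the total consumption, never an individual reading. This is a generic illustration, not the paper's concrete protocol.

```python
import secrets

readings = [17, 42, 8]                    # per-customer meter readings
n, q = len(readings), 2**61 - 1           # arithmetic modulo a public prime

# Pairwise masks: customer i adds m[i][j] and subtracts m[j][i] for each
# peer j, so every mask appears once with each sign across all reports.
m = [[secrets.randbelow(q) for _ in range(n)] for _ in range(n)]
reports = [
    (r + sum(m[i][j] for j in range(n)) - sum(m[j][i] for j in range(n))) % q
    for i, r in enumerate(readings)
]

# Each report alone is uniformly random, but the masks cancel in aggregate.
assert sum(reports) % q == sum(readings) % q
print(sum(reports) % q)                   # the fog node learns only the total
```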
Large language models (LLMs) have shown excellent performance on various tasks, but their astronomical model size raises the hardware barrier for serving (memory size) and slows down token generation (memory bandwidth). In this paper, we propose Activation-aware Weight Quantization (AWQ), a hardware-friendly approach for LLM low-bit weight-only quantization. Our method is based on the observation that weights are not equally important: protecting only 1% of the salient weights can greatly reduce quantization error. We then propose to...
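The core idea can be sketched with a toy per-tensor INT4 quantizer: the weight channels with the largest average activation magnitude are scaled up before quantization and the inverse scale is folded into the activations, which is mathematically equivalent in full precision but protects the salient weights. The scale of 2.0 and the quantizer below are illustrative assumptions; AWQ searches the scales and quantizes in groups.

```python
import numpy as np

def quantize_int4(w):
    """Toy symmetric per-tensor 4-bit round-to-nearest quantizer."""
    scale = np.abs(w).max() / 7.0
    return np.clip(np.round(w / scale), -8, 7) * scale

rng = np.random.default_rng(0)
W = rng.normal(0, 1, (128, 128))        # weights: out_features x in_features
X = rng.normal(0, 1, (1024, 128))       # calibration activations
X[:, :2] *= 50.0                        # a couple of heavy (salient) channels

act_mag = np.abs(X).mean(axis=0)        # per-input-channel importance
s = np.ones(128)
s[act_mag >= np.quantile(act_mag, 0.99)] = 2.0   # toy scale; AWQ searches it

Y_ref = X @ W.T
Y_rtn = X @ quantize_int4(W).T                   # plain round-to-nearest
Y_awq = (X / s) @ quantize_int4(W * s).T         # scale weights, fold 1/s into X
print(np.mean((Y_ref - Y_rtn) ** 2), np.mean((Y_ref - Y_awq) ** 2))
```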
Large language models (LLMs) have transformed numerous AI applications. On-device LLM is becoming increasingly important: running LLMs locally on edge devices can reduce cloud computing cost and protect users' privacy. However, the astronomical model size and limited hardware resources pose significant deployment challenges. To solve these issues, we propose Activation-aware Weight Quantization (AWQ) and TinyChat, an algorithm-system full-stack solution for efficient on-device LLM deployment. AWQ is a...
Cyber-physical systems (CPSs) integrate computation with physical processes. Embedded computers and networks monitor and control the physical processes, usually through feedback loops in which physical processes affect computations and vice versa. CPS was identified as one of eight research priority areas in the August 2007 report of the President's Council of Advisors on Science and Technology, and it will be a core component of many critical infrastructures and industrial systems in the near future. However, a variety of random failures and cyber attacks exist in CPS, which greatly...
State-of-the-art deep neural networks (DNNs) have hundreds of millions of connections and are both computationally and memory intensive, making them difficult to deploy on embedded systems with limited hardware resources and power budgets. While custom hardware helps the computation, fetching weights from DRAM is two orders of magnitude more expensive than ALU operations and dominates the required power. The previously proposed 'Deep Compression' makes it possible to fit large DNNs (AlexNet and VGGNet) fully in on-chip SRAM. This...
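The computation such an accelerator targets is a sparse matrix-vector product that skips both zero weights (via compressed-sparse-column storage) and zero activations; below is a small software model of that dataflow, with illustrative names.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(0, 1, (32, 64))
W[np.abs(W) < 1.0] = 0.0                 # pruned weights (~68% zeros)

# Compressed-sparse-column storage: nonzero rows and values per input column.
csc = [(np.flatnonzero(W[:, j]), W[np.flatnonzero(W[:, j]), j])
       for j in range(W.shape[1])]

x = np.maximum(rng.normal(0, 1, 64), 0)  # ReLU output: dynamic sparsity

y = np.zeros(32)
for j in np.flatnonzero(x):              # skip zero activations entirely
    rows, vals = csc[j]
    y[rows] += vals * x[j]               # and zero weights, via the CSC lists

assert np.allclose(y, W @ x)
print(f"multiplies done: {sum(len(csc[j][0]) for j in np.flatnonzero(x))} "
      f"vs dense {W.size}")
```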
Online anomaly detection (AD) is an important technique for monitoring wireless sensor networks (WSNs), as it protects WSNs from cyberattacks and random faults. As a scalable and parameter-free unsupervised AD technique, the k-nearest neighbor (kNN) algorithm has attracted a lot of attention for its applications in computer networks and WSNs. However, its lazy-learning nature makes kNN-based AD schemes difficult to use in an online manner, especially when the communication cost is constrained. In this paper, a new scheme based...
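As background for such schemes, the sketch below shows plain distance-based kNN anomaly scoring over a sliding window of recent readings; the window size, k, and threshold rule are illustrative choices, not the paper's.

```python
import numpy as np

def knn_score(window: np.ndarray, x: np.ndarray, k: int = 5) -> float:
    """Distance from x to its k-th nearest neighbor in the window."""
    d = np.linalg.norm(window - x, axis=1)
    return float(np.sort(d)[k - 1])

rng = np.random.default_rng(0)
window = rng.normal(0, 1, (200, 2))      # sliding window of recent readings

# Calibrate a threshold from the window itself (self-distances included,
# which is acceptable for a sketch).
tau = np.quantile([knn_score(window, w) for w in window], 0.99)

normal_reading = rng.normal(0, 1, 2)
faulty_reading = np.array([6.0, -6.0])
print(knn_score(window, normal_reading) > tau)   # expected False: accepted
print(knn_score(window, faulty_reading) > tau)   # expected True: flagged
```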
Wireless body area networks (WBANs), as a promising health-care system, can provide tremendous benefits for timely and continuous patient care and remote health monitoring. Owing to the restricted communication and computation power in WBANs, cloud-assisted WBANs, which offer more reliable and intelligent services to mobile users and patients, are receiving increasing attention. However, how to aggregate data multifunctionally and efficiently remains an open issue for the cloud server (CS). In this paper, we propose...
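A hedged illustration of what "multifunctional" aggregation buys: from just the aggregated sum of readings and the aggregated sum of squares, the cloud server can derive both mean and variance. Only the plaintext arithmetic is shown here; the actual schemes wrap these aggregates in encryption.

```python
heart_rates = [72, 88, 95, 60, 77]        # readings from WBAN patients
n = len(heart_rates)
s1 = sum(heart_rates)                     # aggregate of x
s2 = sum(x * x for x in heart_rates)      # aggregate of x^2
mean = s1 / n
variance = s2 / n - mean ** 2             # Var[x] = E[x^2] - E[x]^2
print(mean, variance)
```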
On-device training enables the model to adapt to new data collected from the sensors by fine-tuning a pre-trained model. Users can benefit from customized AI models without having to transfer data to the cloud, protecting privacy. However, the training memory consumption is prohibitive for IoT devices that have tiny memory resources. We propose an algorithm-system co-design framework to make on-device training possible with only 256KB of memory. On-device training faces two unique challenges: (1) quantized graphs of neural networks are hard to optimize due to low bit-precision...
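One ingredient that makes tiny-memory training feasible is updating only a sparse subset of parameters, since frozen weights need no stored input activations for their gradients. The sketch below freezes everything except biases and the final classifier; this fixed layer policy is an illustrative stand-in for the paper's searched sparse-update scheme.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))

for name, p in model.named_parameters():
    # Train only bias terms and the last linear layer ("6." in Sequential
    # naming); frozen conv weights don't need their inputs kept for backprop.
    p.requires_grad_(name.endswith("bias") or name.startswith("6."))

opt = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=0.01)
x, y = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
opt.step()
```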
Large language models (LLMs) show excellent performance but are compute- and memory-intensive. Quantization can reduce memory and accelerate inference. However, existing methods cannot maintain accuracy and hardware efficiency at the same time. We propose SmoothQuant, a training-free, accuracy-preserving, general-purpose post-training quantization (PTQ) solution to enable 8-bit weight, 8-bit activation (W8A8) quantization for LLMs. Based on the fact that weights are easy to quantize while activations are not, SmoothQuant smooths...
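The smoothing step admits a compact sketch: per-input-channel scales s_j = max|X_j|^alpha / max|W_j|^(1-alpha) leave the product X W mathematically unchanged while flattening activation outliers into the easier-to-quantize weights. The alpha value and toy INT8 fake-quantizer below are illustrative.

```python
import numpy as np

def fake_quant_int8(t):
    """Toy symmetric per-tensor INT8 round-to-nearest fake quantizer."""
    scale = np.abs(t).max() / 127.0
    return np.clip(np.round(t / scale), -128, 127) * scale

rng = np.random.default_rng(0)
X = rng.normal(0, 1, (256, 64))
X[:, :4] *= 30.0                        # outlier activation channels
W = rng.normal(0, 0.1, (64, 64))        # weights: in_features x out_features

alpha = 0.5                             # migration strength
s = np.abs(X).max(axis=0) ** alpha / np.abs(W).max(axis=1) ** (1 - alpha)

Y_ref = X @ W
Y_naive = fake_quant_int8(X) @ fake_quant_int8(W)
Y_smooth = fake_quant_int8(X / s) @ fake_quant_int8(W * s[:, None])
print(np.mean((Y_ref - Y_naive) ** 2), np.mean((Y_ref - Y_smooth) ** 2))
```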
Privacy-preserving federated learning, as one of the privacy-preserving computation techniques, is a promising distributed machine learning (ML) approach for the Internet of Medical Things (IoMT), owing to its ability to train a regression model without collecting the raw data of data owners (DOs). However, traditional interactive federated regression training (IFRT) schemes rely on multiple rounds of communication to train a global model and remain exposed to various privacy and security threats. To overcome these problems, several noninteractive federated regression training (NFRT) schemes have...
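To see why a single round can suffice for regression, the sketch below has each data owner upload its local sufficient statistics (X^T X, X^T y) once; the aggregator sums them and solves the normal equations. The encryption that actual NFRT schemes add on top of these uploads is omitted here.

```python
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])

uploads = []
for _ in range(5):                        # five data owners, local data only
    X = rng.normal(0, 1, (100, 3))
    y = X @ true_w + rng.normal(0, 0.01, 100)
    uploads.append((X.T @ X, X.T @ y))    # one-shot upload: sufficient stats

A = sum(u[0] for u in uploads)            # aggregator sums X^T X ...
b = sum(u[1] for u in uploads)            # ... and X^T y across owners
w_hat = np.linalg.solve(A, b)             # global least-squares solution
print(np.round(w_hat, 3))                 # recovers approximately true_w
```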
We present LongLoRA, an efficient fine-tuning approach that extends the context sizes of pre-trained large language models (LLMs) with limited computation cost. Typically, training LLMs with long context sizes is computationally expensive, requiring extensive training hours and GPU resources; for example, training on a context length of 8192 incurs 16x the computational cost in self-attention layers compared with a context length of 2048. In this paper, we speed up context extension in two aspects. On the one hand, although dense global attention is needed during inference, the model can be...
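The training-time attention pattern can be sketched as group-local attention plus a half-group cyclic shift so information crosses group boundaries; the single-head toy below averages the plain and shifted outputs as a stand-in for assigning the shift to half of the heads, and all shapes are illustrative.

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def group_attention(q, k, v, group):
    """Full attention restricted to consecutive token groups."""
    T, d = q.shape
    out = np.empty_like(q)
    for s in range(0, T, group):
        sl = slice(s, s + group)
        out[sl] = softmax(q[sl] @ k[sl].T / np.sqrt(d)) @ v[sl]
    return out

rng = np.random.default_rng(0)
T, d, G = 64, 16, 16
q, k, v = (rng.normal(0, 1, (T, d)) for _ in range(3))

plain = group_attention(q, k, v, G)
# "Shifted" variant: cyclically roll tokens by half a group, attend, roll back.
shift = G // 2
q2, k2, v2 = (np.roll(t, -shift, axis=0) for t in (q, k, v))
shifted = np.roll(group_attention(q2, k2, v2, G), shift, axis=0)
# Stand-in for splitting heads: half use plain groups, half use shifted ones.
mixed = 0.5 * (plain + shifted)
print(mixed.shape)
```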
Long Short-Term Memory (LSTM) is widely used in speech recognition. To achieve higher prediction accuracy, machine learning scientists have built larger and larger models. Such large models are both computation intensive and memory intensive. Deploying such bulky models results in high power consumption and raises the total cost of ownership (TCO) of a data center. To speed up prediction and make it energy efficient, we first propose a load-balance-aware pruning method that can compress the LSTM model size by 20x (10x from pruning and 2x from quantization) with...
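Load-balance-aware pruning can be illustrated by pruning each block of rows assigned to one processing element to the same sparsity, instead of applying a single global threshold; the block size and sparsity below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(0, 1, (16, 64))
sparsity, block = 0.9, 4                 # prune 90%; 4 rows per processing element

for r in range(0, W.shape[0], block):
    blk = W[r:r + block]                 # submatrix handled by one PE
    thresh = np.quantile(np.abs(blk), sparsity)
    blk[np.abs(blk) < thresh] = 0.0      # per-block threshold, not a global one

nnz = [(W[r:r + block] != 0).sum() for r in range(0, W.shape[0], block)]
print(nnz)                               # equal nonzero counts -> balanced PE load
```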
Deep learning on point clouds has received increased attention thanks to its wide applications in AR/VR and autonomous driving. These applications require low latency and high accuracy to provide a real-time user experience and ensure safety. Unlike conventional dense workloads, the sparse and irregular nature of point clouds poses severe challenges to running sparse CNNs efficiently on general-purpose hardware, and existing sparse acceleration techniques for 2D images do not translate to 3D point clouds. In this paper, we introduce TorchSparse, a...
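A simplified software model of map-based sparse convolution of the kind such libraries optimize: only occupied voxels are stored, and for each kernel offset a gather-matmul-scatter applies the weights between coordinate pairs that actually exist. The 2D toy grid and the submanifold convention (output coordinates equal input coordinates) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
coords = {(0, 0), (0, 1), (2, 2), (3, 1)}          # occupied voxels only
feats = {c: rng.normal(0, 1, 4) for c in coords}   # 4-dim feature per voxel
kernel = {off: rng.normal(0, 0.1, (4, 4))          # 3x3 kernel, 4 -> 4 channels
          for off in [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]}

out = {c: np.zeros(4) for c in coords}             # submanifold: same coords
for off, W in kernel.items():
    for c in coords:
        nb = (c[0] + off[0], c[1] + off[1])
        if nb in coords:                           # "kernel map": existing pairs
            out[c] += feats[nb] @ W                # gather -> matmul -> scatter

print(out[(0, 0)])
```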
Transfer learning is important for foundation models to adapt to downstream tasks. However, many foundation models are proprietary, so users must share their data with the model owners to fine-tune the models, which is costly and raises privacy concerns. Moreover, fine-tuning large foundation models is computation-intensive and impractical for most downstream users. In this paper, we propose Offsite-Tuning, a privacy-preserving and efficient transfer learning framework that can adapt billion-parameter foundation models to downstream data without access to the full model. In offsite-tuning, the model owner sends a light-weight adapter...
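The split can be sketched in a few lines: the user receives shallow and deep "adapter" layers to train plus a frozen, compressed emulator of the middle of the network; here simple layer dropping stands in for the paper's distilled emulator, and all sizes are illustrative.

```python
import torch
import torch.nn as nn

full = nn.Sequential(*[nn.Linear(32, 32) for _ in range(12)])  # owner's model

adapter_in = full[:2]                     # shallow layers, sent to the user
adapter_out = full[-2:]                   # deep layers, sent to the user
# Compressed emulator of the frozen middle (layer dropping as a stand-in
# for the distilled emulator the paper uses).
emulator = nn.Sequential(*list(full[2:-2])[::2])
for p in emulator.parameters():
    p.requires_grad_(False)               # the user trains adapters only

user_model = nn.Sequential(adapter_in, emulator, adapter_out)
opt = torch.optim.Adam(
    (p for p in user_model.parameters() if p.requires_grad), lr=1e-4)

x, y = torch.randn(8, 32), torch.randn(8, 32)
loss = nn.functional.mse_loss(user_model(x), y)
loss.backward()
opt.step()
# The user returns only the tuned adapters; the owner plugs them back into
# `full`, so neither the data nor the full model ever changes hands.
```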