- Advanced Neural Network Applications
- Parallel Computing and Optimization Techniques
- Interconnection Networks and Systems
- Adversarial Robustness in Machine Learning
- Advanced Memory and Neural Computing
- Advanced Image and Video Retrieval Techniques
- Security and Verification in Computing
- Domain Adaptation and Few-Shot Learning
- CCD and CMOS Imaging Sensors
- Adaptive Control of Nonlinear Systems
- Vehicle License Plate Recognition
- Cryptography and Data Security
- Low-power high-performance VLSI design
- Cryptographic Implementations and Security
- Advanced Image Processing Techniques
- Genetics, Aging, and Longevity in Model Organisms
- Image Processing Techniques and Applications
- Iterative Learning Control Systems
- Phonocardiography and Auscultation Techniques
- Ship Hydrodynamics and Maneuverability
- Image and Video Stabilization
- Video Coding and Compression Technologies
- Radiomics and Machine Learning in Medical Imaging
- Heat Transfer and Optimization
- Soil Mechanics and Vehicle Dynamics
University of Chinese Academy of Sciences
2015-2025
Chinese Academy of Sciences
2015-2025
Institute of Computing Technology
2013-2025
Shandong University
2025
Southeast University
2024
PetroChina Southwest Oil and Gas Field Company (China)
2024
Shanghai Institute of Applied Physics
2024
Key Laboratory of Nuclear Radiation and Nuclear Energy Technology
2024
State Key Laboratory of Computer Architecture
2021-2023
Shanghai Innovative Research Center of Traditional Chinese Medicine
2023
Despite the high accuracy achieved by deep neural network (DNN) techniques, there is still a lack of satisfactory methodologies to protect the intellectual property (IP) of DNNs, which involve extensive valuable training data, abundant hardware resources, and the fine-tuning skills of experienced experts. Existing solutions based on watermarking cannot prevent malicious or unauthorized users from using well-trained DNNs. This paper proposes chaotic weights (ChaoWs), a novel framework based on Chaotic Map theory, to protect the IP of DNN...
Inference efficiency is the predominant consideration in designing deep learning accelerators. Previous work mainly focuses on skipping zero values to deal with remarkable ineffectual computation, while zero bits in non-zero values, as another major source of ineffectual computation, are often ignored. The reason lies in the difficulty of extracting essential bits during multiply-and-accumulate (MAC) operation in the processing element. Based on the fact that zero bits occupy as high as a 68.9% fraction of the overall weights of modern convolutional neural network models, this paper...
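The idea of zero bits inside non-zero weights can be made concrete with a small measurement. The sketch below is illustrative only and is not taken from the paper: the function name `zero_bit_fraction`, the 8-bit width, and the choice to count bits of the weight magnitude (a sign-magnitude view) are all assumptions.

```python
def zero_bit_fraction(weights, bits=8):
    """Fraction of zero bits among the magnitude bits of the non-zero weights.

    Assumptions (hypothetical, for illustration): weights are integers already
    quantized to `bits` bits, and only the magnitude |w| is inspected.
    """
    nonzero = [w for w in weights if w != 0]
    if not nonzero:
        return 0.0  # no non-zero weights, so nothing to measure
    mask = (1 << bits) - 1
    total_bits = bits * len(nonzero)
    # Count the '1' bits in each magnitude; everything else is a zero bit.
    one_bits = sum(bin(abs(w) & mask).count("1") for w in nonzero)
    return 1.0 - one_bits / total_bits


if __name__ == "__main__":
    sample = [0, 1, 3, 15]  # made-up quantized weights, not real model data
    print(f"zero-bit fraction: {zero_bit_fraction(sample):.3f}")
```

A value-skipping accelerator would only remove the `0` entry in `sample`; the measurement shows how many bit positions in the remaining weights still carry no information.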
Along with the rapid evolution of deep neural networks, the ever-increasing complexity imposes formidable computation intensity on the hardware accelerator. In this paper, we propose a novel computing philosophy called "bit interleaving" and the associated accelerator design called "Bitlet" to maximally exploit bit-level sparsity. Apart from existing bit-serial/parallel accelerators, Bitlet leverages the abundant "sparsity parallelism" in the parameters to enforce inference acceleration. Bitlet is versatile by supporting diverse...
With the rapid development of drilling automation, how to automate real-time online measurement of drilling fluid rheology accurately has become one of the problems that the industry urgently wants to solve. At present, the common method is rheological measurement of the fluid in a straight pipe, but the pumping equipment will often produce pulsating flow during reciprocating pumping, resulting in a large error in the pressure difference data and the measurement, so the corresponding rheological parameters obtained are not accurate. In order to solve the problem of low accuracy...
Cloud service providers use workload consolidation techniques in many-core cloud processors to optimize system utilization and augment performance for ever-extending scale-out workloads. Performance isolation usually has to be enforced for the consolidated workloads sharing the same resources. The network-on-chip (NoC), which serves as a major shared resource, also needs to be isolated to avoid violating performance isolation. Prior work uses strict network isolation to fulfill performance isolation. However, it either results in low consolidation density, or complex routing mechanisms...
Networks-on-Chip (NoC) are gradually becoming a main contributor to chip-level power consumption. Due to the temporal and spatial heterogeneity of on-chip traffic, existing power management approaches cannot adapt the NoC power consumption to its traffic intensity, and hence lead to suboptimal power efficiency. They either resort to an over-provisioned design that only suits the spatial distribution, or to coarse-grained power gating that only serves the temporal variation. In this paper, we propose a novel NoC architecture called Shuttle NoC (ShuttleNoC). By permitting packets...
Memory access patterns could leak temporal and spatial information in a sensitive program; therefore, obfuscated memory access patterns are desired from the security perspective. Oblivious RAM (ORAM) has been the favored candidate to eliminate access pattern leakage through randomly remapping data blocks around the physical memory space. Meanwhile, accessing memory with ORAM protocols results in significant memory bandwidth overhead. For each memory request, after going through the ORAM obfuscation, the main memory needs to service tens of actual accesses, and only one real access out of them is...
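To see where "tens of actual accesses" per request can come from, the sketch below counts block transfers for one tree-based ORAM access. It assumes a Path ORAM style organization (a binary tree of buckets in which every request reads and then writes back one full root-to-leaf path); the abstract does not name the protocol it studies, so the protocol choice, the function name, and the parameter values are all assumptions for illustration.

```python
def path_oram_blocks_per_request(tree_height, bucket_size):
    """Block transfers for one request under a Path ORAM style protocol.

    Assumptions (hypothetical): a binary tree with levels 0..tree_height,
    `bucket_size` block slots per bucket, and no recursive position map.
    """
    buckets_on_path = tree_height + 1          # root through leaf
    blocks_on_path = bucket_size * buckets_on_path
    return 2 * blocks_on_path                  # read the path, then write it back


if __name__ == "__main__":
    height, slots = 4, 4                       # made-up small configuration
    total = path_oram_blocks_per_request(height, slots)
    print(f"{total} block transfers to serve 1 real block")
```

Under these assumptions the overhead grows with the tree height, so larger protected memories pay proportionally more bandwidth per real access.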
With the development of fully homomorphic encryption (FHE), an important solution for privacy computing, the explosion of data size and computing intensity in FHE applications brings enormous challenges to hardware design. In this article, we propose a novel co-design scheme for FHE acceleration named "Poseidon-NDP," which focuses on improving the efficiency of resources and bandwidth. Specifically, we investigate the special implications imposed by FHE applications. It empirically shows that the performance suffers from both...
In this study, a quasi-finite-time control method for designing stabilising control laws is developed for high-order strict-feedback nonlinear systems with mismatched disturbances. By using the mapping filtered forwarding technique, a virtual control is designed to force the off-the-manifold coordinate to converge to zero in quasi-finite time at each step of the design; at the same time, the manifold is rendered insensitive to time-varying, bounded and unknown disturbances. In terms of the standard methodology, the algorithm proposed here not only does not require Lyapunov...
This work develops a robust adaptive control algorithm for uncertain nonlinear systems with parametric uncertainties and external disturbances satisfying an extended matching condition. The method is implemented in the framework of the mapping filtered forwarding-based technique. As an attractive alternative to the backstepping method, this bottom-up strategy forms a virtual controller and a parameter update law at each step of the design, where Lyapunov functions and prior knowledge of the system parameters are not required....
Inference efficiency is the predominant design consideration for modern machine learning accelerators. The ability to execute multiply-and-accumulate (MAC) operations significantly impacts throughput and energy consumption during inference. However, the MAC operation suffers from significant ineffectual computations that severely undermine inference efficiency and must be appropriately handled by the accelerator. The ineffectual computations are manifested in two ways: first, zero values as the input operands of the multiplier, which waste time but contribute nothing...
Networks-on-chip (NoCs), as the communication infrastructure in many-core processors, have demonstrated remarkable power consumption along with technology scaling. However, due to the temporal and spatial heterogeneity of on-chip traffic, one critical problem is that the NoC cannot effectively adapt to the variation of its traffic intensity, also known as localized adaptation, hence yielding a suboptimal power efficiency. Prior approaches either resort to an over-provisioned design or to coarse-grained bandwidth scaling...
Many recent excellent methods for efficient real-time semantic segmentation are of low precision and rely heavily on multiple GPUs for training. In this paper, we rethink the critical factors affecting the accuracy of the models. Previous works usually reduce the input resolution prior to training the parameters of the models by cropping or resizing the images. On the contrary, our empirical study shows that the reduced images lose important content information and details, which are vital for high precision. However, unable to train the original...
With the development of society, more and more goods are sold intelligently without people. The purpose of supermarket shopping robots is to help shop assistants replenish goods and to free people from tedious mechanical work. Based on the “Innovation Robot Design Production Competition-Supermarket Challenge Competition”, the competition rules are taken as the design criteria. On the basis of a CNN (convolutional neural network), we design an image recognition algorithm used in the supermarket shopping robot. This algorithm overcomes the problems of low accuracy and slow speed of image recognition, and enables...
Workload consolidation is widely used in modern cloud processors to reduce the total cost of ownership. Performance isolation has to be enforced between consolidated workloads to achieve controllable quality of service. Networks-on-chip (NoCs), as a major shared resource, often incur traffic interference and violate the performance isolation criteria. Previous work resorts to a strict isolation strategy that partitions the NoC into independent regions to isolate core-to-core communication traffic. However, it either results in low consolidation density or...
Classic DNN pruning mostly leverages software-based methodologies to tackle the accuracy/speed tradeoff, which involves complicated procedures like critical parameter searching, fine-tuning and sparse training to find the best plan. In this paper, we explore the opportunities of hardware runtime pruning and propose a hardware runtime pruning methodology, termed "BitX", to empower versatile DNN inference. It targets the abundant useless bits in the parameters, and pinpoints and prunes these bits on-the-fly in the proposed BitX accelerator. The versatility of BitX lies in: (1)...
The collision avoidance and path-following problem is fundamental for unmanned surface vehicles (USVs) to accomplish various tasks in different water environments. However, addressing this issue is challenging, as USVs are inevitably affected by environmental disturbances in practice. In this study, we address the robust collision avoidance and path-following problem with unknown bounded disturbances using a control barrier function (CBF)-based approach. To reduce the conservativeness of the control actions, the elliptical shape of the USV is considered when designing the control law....