- Advanced Memory and Neural Computing
- Advanced Neural Network Applications
- Advanced Steganography and Watermarking Techniques
- Low-power high-performance VLSI design
- Prosthetics and Rehabilitation Robotics
- CCD and CMOS Imaging Sensors
- Advancements in Semiconductor Devices and Circuit Design
- Muscle activation and electromyography studies
- Advanced Image and Video Retrieval Techniques
- Semiconductor materials and devices
- Stroke Rehabilitation and Recovery
- Chaos-based Image/Signal Encryption
- Video Coding and Compression Technologies
- Digital Media Forensic Detection
- Parallel Computing and Optimization Techniques
- Advanced Vision and Imaging
- Ferroelectric and Negative Capacitance Devices
- Radiation Effects in Electronics
- Adversarial Robustness in Machine Learning
- Neural Networks and Reservoir Computing
- Advanced Image Processing Techniques
- Mechanical Circulatory Support Devices
- Graphene research and applications
- Image and Video Stabilization
- Machine Learning and ELM
Hefei University of Technology
2019-2023
Nanjing University of Aeronautics and Astronautics
2019-2021
Institute of Computing Technology
2019-2021
Chinese Academy of Sciences
2019-2021
Wan Fang Hospital
2006
This article presents a novel reconfigurable torque-controllable variable stiffness actuator (RVSA) for a knee exoskeleton. The concept of reconfiguring the pulley block is proposed to make the RVSA work in different torque and stiffness ranges. The reconfigurability allows the RVSA to achieve a variety of passive stiffness behaviors, namely softening, linear, and hardening behaviors. Compared with existing actuators based on the principle of changing the spring preload, an advantage of the RVSA is that it can achieve a wider range of joint stiffness. Moreover, a model is established...
Prior works typically conducted the fault analysis of neural network accelerator computing arrays with simulation and focused on the prediction accuracy loss of neural network models. There is still a lack of systematic fault analysis of the neural network acceleration system that considers both accuracy degradation and system exceptions, such as system stall and running overtime. To this end, we implemented a representative neural network accelerator and corresponding fault injection modules on a Xilinx ARM-FPGA platform and evaluated the system reliability under different fault injection rates when a series of typical neural network models are deployed on the acceleration system. The entire...
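As a rough illustration of what fault injection at a given rate means, the sketch below flips bits in 8-bit words in software. It is a minimal model under stated assumptions, not the FPGA injection modules from the abstract: the function name `inject_bit_flips`, the 8-bit unsigned word format, and the independent per-bit fault model are all hypothetical.

```python
import random

def inject_bit_flips(weights, fault_rate, bits=8, seed=0):
    """Flip each bit of each unsigned `bits`-wide word with probability `fault_rate`.

    Assumption: faults are independent single-bit flips; real hardware
    faults may be stuck-at, bursty, or location dependent.
    """
    rng = random.Random(seed)  # seeded so experiments are repeatable
    faulty = []
    for w in weights:
        for b in range(bits):
            if rng.random() < fault_rate:
                w ^= 1 << b  # flip bit b
        faulty.append(w)
    return faulty

weights = [0, 1, 127, 255]
print(inject_bit_flips(weights, fault_rate=0.0))  # no faults injected
print(inject_bit_flips(weights, fault_rate=1.0))  # every bit flipped
```

Sweeping `fault_rate` and re-running inference on the corrupted values is one simple way to trace accuracy against fault rate, although a software model like this cannot reproduce system-level exceptions such as stalls.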
Convolutional neural networks (CNNs) have become the state-of-the-art technique in many classification tasks in IoT systems. However, low-power and area-constrained edge devices are unable to afford the expensive cost of CNNs. Resistive random access memory (RRAM) is attractive for establishing a CNN accelerator at the edge end due to its features of scalability and in-situ dot-product. Mapping a network architecture onto general-purpose RRAM suffers from the severe issue of resource underutilization. The quantization offers an...
A regular 2D computing array is widely utilized for the processing of major neural network operations in many deep learning accelerators (DLAs). Hardware failures on the computing array can lead to considerable computing errors and prediction accuracy loss. Prior works proposed to add homogeneous redundant processing elements (PEs) to each row or column of the regular computing array to mitigate faulty PEs, but they may fail to recover the array from faults when the number of faulty PEs in a row or column exceeds the number of redundant PEs in the corresponding row or column. The problem gets worse when the faulty PEs are not evenly distributed across the array. To address the problem, we...
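The failure condition of per-row redundancy described above can be checked with a few lines of code. This is a minimal sketch, assuming a hypothetical binary `fault_map` (1 marks a faulty PE) and a fixed number of spare PEs per row; it illustrates the limitation of the prior approach, not the solution proposed in the paper.

```python
def unrecoverable_rows(fault_map, spares_per_row=1):
    """Return indices of rows whose faulty-PE count exceeds the spares in that row."""
    return [r for r, row in enumerate(fault_map) if sum(row) > spares_per_row]

# 4x4 array with three faults, two of them clustered in row 2.
fault_map = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]

total_faults = sum(map(sum, fault_map))
total_spares = len(fault_map) * 1
print(total_faults <= total_spares)   # enough spares overall
print(unrecoverable_rows(fault_map))  # yet one row still cannot be repaired
```

The example has fewer faults than spares in total, but the clustered row cannot be repaired because each spare only serves its own row. This is the uneven-distribution case the abstract refers to.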
Hardware faults on the regular 2-D computing array of a typical deep learning accelerator (DLA) can lead to dramatic prediction accuracy loss. Prior redundancy design approaches typically have each homogeneous redundant processing element (PE) mitigate faulty PEs for a limited region of the 2-D computing array rather than the entire computing array to avoid excessive hardware overhead. However, they fail to recover the computing array when the number of faulty PEs in any region exceeds the number of redundant PEs in the same region. The mismatch problem deteriorates when the fault injection rate rises and the faults are unevenly distributed....
Purpose: Hand motor dysfunction has seriously reduced people’s quality of life. The purpose of this paper is to solve this problem; different soft exoskeleton robots have been developed because of their good application prospects in assistance. In this paper, a new soft hand exoskeleton is designed to help people conduct rehabilitation training. Design/methodology/approach: The proposed soft exoskeleton is an under-actuated cable-driven mechanism, which optimizes the force transmission path and many local structures. Specifically, the optimized cables are...
With the advancements of neural networks, customized accelerators are increasingly adopted in massive AI applications. To gain higher energy efficiency or performance, many hardware design optimizations such as near-threshold logic and overclocking can be utilized. In these cases, computing errors may happen, and the computing errors are difficult to capture with conventional training on general-purpose processors (GPPs). Applying the offline trained neural network models to the accelerators with errors directly may lead to considerable prediction accuracy loss....
The increasing hardware failures caused by the shrinking semiconductor technologies pose substantial influence on neural accelerators, and improving the resilience of neural network execution becomes a great design challenge, especially for mission-critical applications such as self-driving and medical diagnosis. The reliability analysis of the neural network execution is a key step to understand the influence of the hardware failures, and is thus highly demanded. Prior works typically conducted the fault analysis of neural network accelerators with simulation and concentrated on the prediction accuracy loss of the models. There is still a lack...
The generative neural network is a new category of neural networks, and it has been widely utilized in many applications such as content generation, unsupervised learning, segmentation, and pose estimation. It typically involves massive computing-intensive deconvolution operations that cannot be fitted to conventional neural network processors directly. However, prior works mainly investigated specialized hardware architectures through intensive hardware modifications to the existing deep learning processors to accelerate deconvolution together with...
Robot-assisted cooperative rehabilitation training has shown superiority in helping individuals with motion impairment problems to regain their motor functions. This paper presents the development and evaluation of a new control scheme for an end-effector-type rehabilitation robot to provide upper extremity training with the desired compliance and intensity. Firstly, the overall mechanical structure and real-time control system of the robot are introduced. Secondly, an integral fuzzy sliding mode impedance control strategy combined with time-delay estimation (IFSMIC-TDE) is...
Neural networks, especially convolutional neural networks (CNNs), have become prevalent, and numerous CNN accelerators have been developed to achieve higher performance. While clock frequency determines the operation speed and has a direct influence on the performance of the accelerators, we propose to apply overclocking, a circuit optimization approach that enables higher clock frequency, to general CNN accelerators. This technique brings significant performance improvement, but it leads to moderate timing errors, wrong computing results, and low prediction...
Artificial Intelligence of Things (AIoT) processors fabricated with newer technology nodes suffer rising soft errors due to the shrinking transistor sizes and lower power supply. Soft errors on the AIoT processors, particularly the deep learning accelerators (DLAs) with massive computing, may cause substantial computing errors. These computing errors are difficult to capture with conventional training on general-purpose processors such as CPUs and GPUs in a server. Applying the offline trained neural network models to the edge accelerators with errors directly may lead to considerable prediction accuracy loss....
Deformable convolution networks (DCNs), proposed to address image recognition with geometric or photometric variations, typically involve deformable convolution that convolves on arbitrary locations of input features. The locations change with different inputs and induce considerable dynamic and irregular memory accesses that cannot be handled by classic neural network accelerators (NNAs). Moreover, the bilinear interpolation (BLI) operation, which is required to obtain deformed features in DCNs, also cannot be deployed on existing NNAs directly....
Carbon nanotube field-effect transistors (CNFETs) emerge as a promising alternative to the conventional CMOS for much higher speed and power efficiency. They are particularly suitable for building the power-hungry last level cache (LLC). However, process variation (PV) in CNFETs substantially affects the operation stability and thus the worst-case timing, which limits the LLC operation frequency dramatically given a fully synchronous design. To address this problem, we developed a variation-aware cache such that each part of the cache can run at its...
The recently proposed Deformable Convolutional Networks (DCNs) greatly enhance the performance of conventional Convolutional Neural Networks (CNNs) on vision recognition tasks by allowing flexible input sampling during inference runtime. DCNs introduce an additional convolutional layer for adaptive sampling offset generation, followed by a bilinear interpolation (BLI) algorithm to integerize the generated non-integer offset values. Finally, a regular convolution is performed on the loaded input pixels. Compared with CNNs, DCN demonstrated significantly...
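The bilinear interpolation step mentioned above can be sketched in plain Python. This is a generic textbook formulation, not the accelerator dataflow from the paper: the function name `bilinear_sample`, the list-of-lists feature map, and the zero padding outside the map are assumptions made for illustration.

```python
import math

def bilinear_sample(feature, y, x):
    """Sample a 2-D feature map at a fractional location (y, x).

    The value is a weighted sum of the four integer-indexed neighbours.
    Assumption: locations outside the map read as 0.0 (zero padding).
    """
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    dy, dx = y - y0, x - x0  # fractional parts act as interpolation weights

    def at(r, c):
        return feature[r][c] if 0 <= r < h and 0 <= c < w else 0.0

    return ((1 - dy) * (1 - dx) * at(y0, x0)
            + (1 - dy) * dx * at(y0, x0 + 1)
            + dy * (1 - dx) * at(y0 + 1, x0)
            + dy * dx * at(y0 + 1, x0 + 1))

feature = [[0.0, 1.0],
           [2.0, 3.0]]
print(bilinear_sample(feature, 0.5, 0.5))  # centre of the four pixels -> 1.5
```

Because the four neighbour addresses depend on the learned offsets, the memory reads are data dependent, which is consistent with the irregular access pattern these abstracts describe.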
Deformable convolution networks (DCNs), proposed to address image recognition with geometric or photometric variations, typically involve deformable convolution that convolves on arbitrary locations of input features. The locations change with different inputs and induce considerable dynamic and irregular memory accesses which cannot be handled by classic neural network accelerators (NNAs). Moreover, the bilinear interpolation (BLI) operation that is required to obtain deformed features in DCNs also cannot be deployed on existing NNAs directly....
The carbon nanotube field-effect transistor (CNFET), which promises both higher clock speed and energy efficiency, becomes an attractive alternative to the conventional power-hungry CMOS cache. We observe that a CNFET-based cache constructed with typical SRAM cells has distinct energy consumption when reading/writing 0 and 1 from/to it. For instance, the energy consumption of writing 1 to a cell is almost 10X higher than that of writing 0. With this observation, we propose an energy-efficient cache design called CNT-Cache to take advantage of this feature. It predicts the cache line access pattern...
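To show why asymmetric write energy is worth exploiting, the sketch below inverts a cache line before storing it whenever the line holds more 1s than 0s. This is a generic inversion-coding illustration, not the CNT-Cache mechanism, whose details are cut off in the abstract. The relative energy constants are assumptions loosely based on the "almost 10X" figure, and all names are hypothetical.

```python
E_WRITE_0 = 1.0   # assumed relative energy of writing a 0
E_WRITE_1 = 10.0  # assumed relative energy of writing a 1 (about 10X)

def write_energy(bits):
    """Relative energy of writing a sequence of bits."""
    ones = sum(bits)
    return ones * E_WRITE_1 + (len(bits) - ones) * E_WRITE_0

def encode_line(bits):
    """Return (flag, stored_bits); invert the line when more than half its bits are 1."""
    if sum(bits) * 2 > len(bits):
        return 1, [1 - b for b in bits]
    return 0, list(bits)

def decode_line(flag, stored):
    """Undo the optional inversion applied by encode_line."""
    return [1 - b for b in stored] if flag else list(stored)

line = [1, 1, 1, 1, 1, 1, 0, 1]
flag, stored = encode_line(line)
plain = write_energy(line)
encoded = write_energy(stored) + write_energy([flag])  # include the flag bit
print(plain, encoded, decode_line(flag, stored) == line)
```

The extra flag bit is charged to the encoded total, so the comparison reflects the overhead of the scheme under the assumed energy model.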